CN113673286B - Depth reconstruction method, system, equipment and medium based on target area

Depth reconstruction method, system, equipment and medium based on target area

Info

Publication number
CN113673286B
Authority
CN
China
Prior art keywords
image
face
rgb
target
rgb image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010411573.9A
Other languages
Chinese (zh)
Other versions
CN113673286A (en)
Inventor
朱力
吕方璐
汪博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Guangjian Technology Co Ltd
Original Assignee
Shenzhen Guangjian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Guangjian Technology Co Ltd filed Critical Shenzhen Guangjian Technology Co Ltd
Priority to CN202010411573.9A
Publication of CN113673286A
Application granted
Publication of CN113673286B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/50 - Depth or shape recovery
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30196 - Human being; Person
    • G06T 2207/30201 - Face

Abstract

The invention provides a depth reconstruction method, system, equipment and medium based on a target area, comprising the following steps: acquiring an RGB image and an infrared light spot image of a target face; performing face detection on the RGB image to determine the face area in the RGB image; and performing depth reconstruction on the target face according to the infrared light spot image and the face area in the RGB image to generate a depth face image. By performing face detection on the acquired RGB image to determine the face area, and then locating the light spot areas in the infrared light spot image that correspond to that face area, a depth face image can be generated for the face area alone, which reduces the amount of computation required for depth reconstruction and improves its efficiency.

Description

Depth reconstruction method, system, equipment and medium based on target area
Technical Field
The invention relates to the field of 3D imaging, in particular to a depth reconstruction method, a depth reconstruction system, depth reconstruction equipment and a depth reconstruction medium based on a target area.
Background
In recent years, with the development of the consumer electronics industry, depth cameras with depth-sensing capability have been receiving increasing attention. The currently mature depth measurement methods are the structured light scheme and the ToF technique.
ToF (time of flight) is a 3D imaging technique in which measurement light emitted by a projector is reflected by the target face back to a receiver, so that the spatial distance from the object to the sensor can be obtained from the propagation time of the measurement light along this round-trip path. Common ToF techniques include the single-point scanning projection method and the surface (flood) projection method.
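By way of a non-limiting illustration of the ToF relation described above, the following Python sketch converts a measured round-trip time into a distance; the example time is illustrative and not taken from the patent:

```python
# Minimal sketch of the ToF distance relation: d = c * t / 2,
# where t is the round-trip propagation time of the measurement light.
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def tof_distance(round_trip_seconds: float) -> float:
    """Distance from sensor to target; the light travels the path twice."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A round trip of about 4 nanoseconds corresponds to roughly 0.6 m.
print(tof_distance(4e-9))  # ~0.5996
```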
The structured light scheme is based on the principle of optical triangulation. An optical projector projects structured light of a certain pattern onto the surface of the object, forming a three-dimensional image of light bars modulated by the shape of the measured surface. This image is captured by a camera at another position, yielding a two-dimensional distorted image of the light bars. The degree of distortion of the light bars depends on the relative position between the optical projector and the camera and on the surface profile (height) of the object. Intuitively, the displacement (or offset) along a light bar is proportional to the surface height, kinks indicate changes of plane, and discontinuities indicate physical gaps in the surface. When the relative position between the optical projector and the camera is fixed, the three-dimensional shape of the object surface can be reproduced from the coordinates of the distorted two-dimensional light bar image.
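The proportionality between light-bar offset and surface height can be illustrated with the standard triangulation relation Z = f·B/d. The sketch below assumes a rectified projector-camera geometry with baseline B and focal length f; the patent does not specify these parameters:

```python
def depth_from_offset(f_px: float, baseline_m: float, offset_px: float) -> float:
    """Triangulated depth for one point of the light bar.

    f_px: camera focal length in pixels; baseline_m: projector-camera
    baseline in metres; offset_px: observed light-bar displacement in pixels.
    """
    if offset_px <= 0:
        raise ValueError("offset must be positive for a valid depth")
    return f_px * baseline_m / offset_px

# Example: f = 600 px, B = 5 cm, offset = 12 px -> Z = 2.5 m.
print(depth_from_offset(600.0, 0.05, 12.0))
```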
The depth camera module widens the dimensions of front-end perception: it resists spoofing attacks with fake bodies and avoids the drop in recognition accuracy that 2D face recognition suffers under extreme conditions. It has been well received by the market, demand is strong, and 3D face recognition can be applied to scenarios such as door locks, access control and payment. However, for scenarios such as payment, how to improve the capture efficiency of face images so as to realize fast payment is a problem for which the prior art offers no corresponding solution.
Disclosure of Invention
In view of the defects in the prior art, the object of the invention is to provide a depth reconstruction method, system, equipment and medium based on a target area.
The depth reconstruction method based on the target area provided by the invention comprises the following steps:
step S1: acquiring an RGB image and an infrared light spot image of a target face;
step S2: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S3: and carrying out depth reconstruction on the target face according to the infrared light spot image and the face area in the RGB image to generate a depth face image.
Preferably, the step S1 includes the steps of:
step S101: alternately acquiring an IR image and an infrared spot image of the target face through an infrared camera;
step S102: collecting RGB images of the target face through an RGB camera;
step S103: and acquiring an RGB image, an IR image and an infrared light spot image of the target face, and previewing the RGB image in real time.
Preferably, the step S3 includes the steps of:
step S301: mapping the face area of the RGB image into the infrared light spot image, and determining a plurality of light spots of the face area in the infrared light spot image;
step S302: acquiring a plurality of light spot areas of the face area;
step S303: and generating a depth face image of the face region according to the deformation or displacement of the plurality of light spot areas corresponding to the face region.
Preferably, when the target face is subjected to depth reconstruction to generate a depth face image, the following steps are performed simultaneously:
- performing quality detection on the RGB image and the IR image;
- performing living body detection on the IR image when the IR image meets a preset quality standard.
Preferably, when quality detection is performed on the RGB image and the IR image, the method includes the steps of:
step M1: acquiring a preset image quality standard;
step M2: judging whether the RGB image and the IR image meet the preset image quality standard; triggering step M3 when they do; when they do not, sending out second prompt information and returning to step S1;
step M3: and performing living body detection on the target face according to the IR image.
Preferably, the step S2 includes the steps of:
step S201: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S202: triggering step S203 when the face area is detected in the RGB image and the IR image; when no face area is detected in the RGB image and the IR image, returning to the step S1;
step S203: performing expression detection on the face area, determining the expression type of the face area, and triggering step S3 when the expression type of the face area conforms to a preset expression type set.
Preferably, the step S203 includes the steps of:
step S2031: acquiring a preset expression type set;
step S2032: judging whether the expression type is in the preset expression type set; triggering step S2033 when it is; when it is not, sending out first prompt information and returning to step S1;
step S2033: selecting the RGB image whose expression type is in the preset expression type set, and triggering step S3.
The depth reconstruction system based on the target area provided by the invention comprises the following modules:
the image acquisition module is used for acquiring an RGB image and an infrared light spot image of the target face;
the face detection module is used for carrying out face detection on the RGB image to determine a face area in the RGB image;
and the depth reconstruction module is used for carrying out depth reconstruction on the target face according to the infrared light spot image and the face region in the RGB image to generate a depth face image.
The depth reconstruction device based on the target area provided by the invention comprises:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the target region based depth reconstruction method via execution of the executable instructions.
According to the present invention, there is provided a computer readable storage medium storing a program which, when executed, implements the steps of the target region-based depth reconstruction method.
Compared with the prior art, the invention has the following beneficial effects:
according to the invention, the face detection is carried out on the acquired RGB image, the face region in the RGB image is determined, and the plurality of spot areas corresponding to the face region in the infrared spot image are determined according to the face region, so that only the depth face image of the face region can be generated, the operation amount of depth reconstruction is reduced, and the efficiency of depth reconstruction is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art. Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of steps of a depth reconstruction method based on a target region according to an embodiment of the present invention;
FIG. 2 is a flowchart showing steps for capturing RGB images, IR images, and IR spot images of a target face according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps of face detection according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the steps of expression detection according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps for quality detection of RGB images and IR images according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating steps for performing depth reconstruction on a target face according to an embodiment of the present invention;
FIG. 7 is a block diagram of a target region based depth reconstruction system according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a depth reconstruction device based on a target region according to an embodiment of the present invention; and
fig. 9 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
The invention provides a depth reconstruction method based on a target area, which aims to solve the problems existing in the prior art.
The following describes the technical scheme of the present invention and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
Fig. 1 is a step flowchart of a depth reconstruction method based on a target area in an embodiment of the present invention, and as shown in fig. 1, the depth reconstruction method based on a target area provided by the present invention includes the following steps:
step S1: acquiring an RGB image, an IR image and an infrared light spot image of a target face;
fig. 2 is a flowchart of steps for acquiring an RGB image, an IR image, and an infrared spot image of a target face in an embodiment of the present invention, as shown in fig. 2, the step S1 includes the following steps:
step S101: alternately acquiring an IR image and an infrared spot image of the target face through an infrared camera;
step S102: collecting RGB images of the target face through an RGB camera;
step S103: and acquiring an RGB image, an IR image and an infrared light spot image of the target face, and previewing the RGB image in real time.
In the embodiment of the invention, the RGB image, the IR image and the infrared light spot image are acquired by a depth camera.
The depth camera comprises a discrete light beam projector, a surface light source projector, an RGB camera and an infrared camera.
The discrete light beam projector is used for projecting a plurality of discrete collimated light beams onto the target face;
the surface light source projector is used for projecting floodlight onto the target face;
the infrared camera is used for receiving the discrete collimated light beams reflected by the target face and obtaining an infrared light spot image of the target face surface from them, and for receiving the floodlight reflected by the target face and obtaining IR image data of the target face surface from it.
The RGB camera is used for collecting RGB images of the target face.
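By way of a non-limiting sketch of the acquisition in steps S101-S103, the following Python outline shows one acquisition round; the camera handles and their capture/show methods are hypothetical, since no capture API is disclosed:

```python
def acquire_one_round(ir_camera, rgb_camera, preview):
    """One acquisition round matching steps S101-S103 (hypothetical API)."""
    # S101: the infrared camera alternates flood-lit IR frames with
    # spot-projected frames from the discrete collimated beams.
    ir_image = ir_camera.capture(illumination="flood")
    spot_image = ir_camera.capture(illumination="spots")
    # S102: the RGB camera collects an RGB frame of the target face.
    rgb_image = rgb_camera.capture()
    # S103: the RGB stream is previewed in real time.
    preview.show(rgb_image)
    return rgb_image, ir_image, spot_image
```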
Step S2: face detection is carried out on the RGB image to determine a face area in the RGB image;
In the embodiment of the present invention, the detected face area may be given as a pixel range, i.e. a bounding box framing the face in the RGB image and the IR image; detecting this pixel range constitutes the actual face detection. The expression detection can be performed by an expression detection model based on a neural network.
Fig. 3 is a flowchart of steps of face detection in the embodiment of the present invention, as shown in fig. 3, the step S2 includes the following steps:
step S201: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S202: triggering step S203 when the face area is detected in the RGB image and the IR image; when no face area is detected in the RGB image and the IR image, returning to the step S1;
step S203: performing expression detection on the face area, determining the expression type of the face area, and triggering step S3 when the expression type of the face area conforms to a preset expression type set.
In the embodiment of the invention, the face area in the RGB image is determined by a face detection model based on a neural network.
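The patent's face detector is a neural-network model that is not disclosed; as a stand-in with the same input/output contract (an image in, a face bounding box out), the sketch below uses OpenCV's bundled Haar cascade:

```python
import cv2

# Stand-in detector: OpenCV's bundled Haar cascade (the patent's own model
# is a neural network that is not disclosed).
_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_region(bgr_image):
    """Return the (x, y, w, h) pixel range of the first detected face, or None."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    boxes = _detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return tuple(boxes[0]) if len(boxes) else None
```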
Fig. 4 is a flowchart of the steps of expression detection in the embodiment of the present invention; as shown in fig. 4, the step S203 includes the following steps:
step S2031: acquiring a preset expression type set;
step S2032: judging whether the expression type is in the preset expression type set; triggering step S2033 when it is; when it is not, sending out first prompt information and returning to step S1;
step S2033: selecting the RGB image whose expression type is in the preset expression type set, and triggering step S3.
In an embodiment of the present invention, the expression type set includes a neutral (plain) expression and smiling. The first prompt message may be "hold your head straight", "keep a neutral expression", "please look forward", and the like.
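Steps S2031-S2033 amount to a gate on the predicted expression label; a minimal sketch follows, with the expression labels and prompt text assumed:

```python
# Assumed labels; the patent only names a neutral expression and smiling.
PRESET_EXPRESSION_TYPES = {"neutral", "smile"}

def expression_gate(expression_type: str) -> bool:
    """Return True to trigger step S3, False to prompt and return to step S1."""
    if expression_type in PRESET_EXPRESSION_TYPES:
        return True
    print("Hold your head straight, keep a neutral expression, look forward")
    return False
```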
In the embodiment of the invention, when the target face is subjected to depth reconstruction to generate a depth face image, the following steps are performed simultaneously:
- performing quality detection on the RGB image and the IR image;
- performing living body detection on the IR image when the IR image meets a preset quality standard.
Fig. 5 is a flowchart illustrating steps for quality detection of RGB images and IR images according to an embodiment of the present invention, where, as shown in fig. 5, the quality detection of RGB images and IR images includes the following steps:
step M1: acquiring a preset image quality standard;
step M2: judging whether the RGB image and the IR image meet the preset image quality standard; triggering step M3 when they do; when they do not, sending out second prompt information and returning to step S1;
step M3: and performing living body detection on the target face according to the IR image.
In an embodiment of the present invention, the image quality standard may be a contrast threshold; the contrast threshold may be set to 150:1, and when the contrast of the RGB image and the IR image is greater than the contrast threshold, the RGB image and the IR image are deemed to meet the preset image quality standard.
In the embodiment of the invention, the image quality standard can also adopt a PSNR (Peak Signal-to-Noise Ratio) threshold; the PSNR threshold may be set to 30 dB, and when the PSNR of the RGB image and the IR image is greater than the PSNR threshold, the RGB image and the IR image are deemed to meet the preset image quality standard.
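PSNR is computed against a reference image; the embodiment does not state how the reference is obtained, so the following sketch takes one as an argument (for instance, a denoised copy of the frame):

```python
import numpy as np

def psnr_db(image: np.ndarray, reference: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((image.astype(np.float64) - reference.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

def meets_quality_standard(image: np.ndarray, reference: np.ndarray) -> bool:
    """Quality gate from the embodiment: accept when PSNR exceeds 30 dB."""
    return psnr_db(image, reference) > 30.0
```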
The second prompt information may be "increase the exposure time", "reduce the exposure time", "apply backlight compensation", and the like.
In the embodiment of the invention, when the IR image passes living body detection, living body detection is then carried out on the depth face image, and after the depth face image passes living body detection, a living body face recognition result is output.
In the embodiment of the invention, the living body face recognition result can be the depth face image, the IR image and the RGB image that passed living body detection, or simply a result indicating that living body detection succeeded.
In the embodiment of the invention, living body detection on the IR image specifically judges whether the light spot image is a light spot image of a living face according to the light spot definition of the IR image: it is judged whether the light spot definition lies within a preset light spot definition threshold interval; when the light spot definition of the pixel area lies within that interval, the image is judged to be a living face light spot image. The light spot definition threshold interval is 10-30. The light spot definition is given by $D(f) = \frac{1}{C}\sum_{(x,y)} \lvert G(x,y) \rvert$, where C is the total number of pixels in the pixel area, D(f) is the light spot definition value, and G(x, y) is the value of the pixel at (x, y) after convolution.
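A sketch of this liveness gate follows; the convolution kernel is not given in the patent, so a Laplacian (a common sharpness operator) is assumed, and D(f) is taken as the mean absolute convolved response over the C pixels of the area:

```python
import numpy as np
from scipy.ndimage import laplace

def spot_definition(pixel_area: np.ndarray) -> float:
    """D(f) = (1/C) * sum |G(x, y)| over the pixel area (kernel assumed)."""
    g = laplace(pixel_area.astype(np.float64))  # G(x, y): convolved values
    return float(np.mean(np.abs(g)))            # the mean divides by C pixels

def is_live_face_spot_image(pixel_area: np.ndarray,
                            lo: float = 10.0, hi: float = 30.0) -> bool:
    """Judge liveness by whether D(f) falls in the 10-30 threshold interval."""
    return lo <= spot_definition(pixel_area) <= hi
```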
In the embodiment of the invention, living body detection on the IR image can be performed by a neural-network-based living body detection model, and living body detection on the depth face image by another neural-network-based living body detection model.
Step S3: and carrying out depth reconstruction on the target face according to the infrared light spot image and the face area in the RGB image to generate a depth face image.
Fig. 6 is a flowchart of the steps of performing depth reconstruction on a target face in the embodiment of the present invention; as shown in fig. 6, performing depth reconstruction on the target face according to the infrared light spot image and the RGB image, namely step S3, includes the following steps:
step S301: mapping the face area of the RGB image into the infrared light spot image, and determining a plurality of light spots of the face area in the infrared light spot image;
step S302: extracting a plurality of light spot areas of the face area;
step S303: and generating a depth face image of the face region according to the deformation or displacement of the plurality of light spot areas corresponding to the face region.
In the embodiment of the invention, the depth face image is generated according to the structured light technique: the depth face image of the face area is obtained from the deformation or displacement of the light spot areas, yielding the relief (concave-convex) depth information of the face area.
In the embodiment of the invention, the depth image of the face area can also be obtained from the time delay or phase difference of a plurality of infrared light spot images, i.e. calculated by the ToF technique.
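Steps S301-S303 can be sketched as follows: the RGB face box is mapped into the light spot image, only the spots inside the mapped box are kept, and each kept spot is triangulated from its displacement. The calibration mapping and the triangulation parameters are assumptions, as the patent leaves both to the camera design:

```python
def face_region_depth(face_box, spot_positions, spot_offsets_px,
                      f_px, baseline_m, rgb_to_ir=lambda x, y: (x, y)):
    """Sparse depth over the face region only (steps S301-S303 sketch).

    face_box: (x, y, w, h) in RGB pixels; spot_positions: spot centres in the
    light spot image; spot_offsets_px: per-spot displacement vs. the reference
    pattern; rgb_to_ir: assumed calibration mapping (identity here).
    """
    x, y, w, h = face_box
    ix0, iy0 = rgb_to_ir(x, y)
    ix1, iy1 = rgb_to_ir(x + w, y + h)
    depths = {}
    for (sx, sy), d in zip(spot_positions, spot_offsets_px):
        # S301/S302: keep only the light spots inside the mapped face area.
        if ix0 <= sx <= ix1 and iy0 <= sy <= iy1 and d > 0:
            # S303: depth from the displacement of this spot (triangulation).
            depths[(sx, sy)] = f_px * baseline_m / d
    return depths
```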
When the target-area-based depth reconstruction method is implemented, it can be implemented on a HiSilicon chip of model Hi3516DV300: face detection is performed on the RGB image and the IR image through the Neural Network Inference Engine (NNIE), while the infrared light spot image is preprocessed through the Intelligent Video Engine (IVE) to extract the plurality of light spot areas in it; then, while the CPU and the Intelligent Video Engine perform depth reconstruction on the target face to generate the depth face image, the Neural Network Inference Engine performs quality detection on the RGB image and the IR image and living body detection on the IR image in sequence.
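The described scheduling overlaps the two hardware engines. The sketch below imitates the two pipelined stages with worker threads; the NNIE and IVE functions are plain stand-ins, since the actual Hi3516DV300 SDK calls are not shown in the patent:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the hardware engines; replace with SDK calls.
def nnie_face_detect(rgb, ir): ...
def ive_extract_spot_areas(spot_image): ...
def cpu_ive_depth_reconstruct(spot_areas, face_box): ...
def nnie_quality_then_liveness(rgb, ir): ...

def process_frame(rgb, ir, spot_image):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stage 1: face detection (NNIE) runs while the spot image is
        # preprocessed (IVE) to extract the light spot areas.
        face_f = pool.submit(nnie_face_detect, rgb, ir)
        spots_f = pool.submit(ive_extract_spot_areas, spot_image)
        face_box, spot_areas = face_f.result(), spots_f.result()
        # Stage 2: depth reconstruction (CPU + IVE) runs while quality
        # detection and living body detection execute on the NNIE.
        depth_f = pool.submit(cpu_ive_depth_reconstruct, spot_areas, face_box)
        checks_f = pool.submit(nnie_quality_then_liveness, rgb, ir)
        return depth_f.result(), checks_f.result()
```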
Fig. 7 is a schematic block diagram of a depth reconstruction system based on a target area according to an embodiment of the present invention, and as shown in fig. 7, the depth reconstruction system based on a target area provided by the present invention includes the following modules:
the image acquisition module is used for acquiring an RGB image and an infrared light spot image of the target face;
the face detection module is used for carrying out face detection on the RGB image to determine a face area in the RGB image;
and the depth reconstruction module is used for carrying out depth reconstruction on the target face according to the infrared light spot image and the face region in the RGB image to generate a depth face image.
The embodiment of the invention also provides a depth reconstruction device based on the target area, comprising a processor and a memory in which executable instructions of the processor are stored, wherein the processor is configured to perform the steps of the target-area-based depth reconstruction method via execution of the executable instructions.
As described above, in this embodiment, face detection is performed on the acquired RGB image, so as to determine a face area in the RGB image, and a plurality of spot areas corresponding to the face area in the infrared spot image are determined according to the face area, so that only a depth face image of the face area can be generated, the computation load of depth reconstruction is reduced, and the efficiency of depth reconstruction is improved.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module" or "platform."
Fig. 8 is a schematic structural diagram of a depth reconstruction apparatus based on a target area in an embodiment of the present invention. An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 600 shown in fig. 8 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 8, the electronic device 600 is in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different platform components (including memory unit 620 and processing unit 610), a display unit 640, etc.
Wherein the storage unit stores program code that is executable by the processing unit 610 such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above-described target region based depth reconstruction method section of the present specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown in fig. 8, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.
The embodiment of the invention also provides a computer readable storage medium for storing a program, and the steps of the target area-based depth reconstruction method are realized when the program is executed. In some possible embodiments, the aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the above description of the target region based depth reconstruction method section, when the program product is run on the terminal device.
As described above, when the program of the computer readable storage medium of this embodiment is executed, the face area in the RGB image is determined by performing face detection on the acquired RGB image, and the plurality of spot areas corresponding to the face area in the infrared spot image are determined according to the face area, so that only the depth face image of the face area can be generated, the computation amount of depth reconstruction is reduced, and the efficiency of depth reconstruction is improved.
Fig. 9 is a schematic structural view of a computer-readable storage medium in an embodiment of the present invention. Referring to fig. 9, a program product 800 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable storage medium may also be any readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
In the embodiment of the invention, the face area in the RGB image is determined by carrying out face detection on the acquired RGB image, and a plurality of spot areas corresponding to the face area in the infrared spot image are determined according to the face area, so that only a depth face image of the face area can be generated, the operation amount of depth reconstruction is reduced, and the efficiency of depth reconstruction is improved;
According to the invention, depth reconstruction of the RGB image and the IR image is performed only after the expression has been screened, which avoids failures caused by poor image quality during depth reconstruction; and the depth face image is generated only after the IR image passes living body detection, so that the capture process is executed compactly, the total time of the capture process is shortened, and the efficiency of face-scan payment is improved.
In the present specification, each embodiment is described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts between the embodiments reference may be made to one another. The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the claims without affecting the spirit of the invention.

Claims (8)

1. The depth reconstruction method based on the target area is characterized by comprising the following steps of:
step S1: acquiring an RGB image and an infrared light spot image of a target face;
step S2: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S3: performing depth reconstruction on the target face according to the infrared light spot image and the face area in the RGB image to generate a depth face image;
the step S1 includes the steps of:
step S101: alternately acquiring an IR image and an infrared spot image of the target face through an infrared camera;
step S102: collecting RGB images of the target face through an RGB camera;
step S103: acquiring an RGB image, an IR image and an infrared light spot image of a target face, and previewing the RGB image in real time;
the step S2 includes the steps of:
step S201: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S202: triggering step S203 when the face area is detected in the RGB image and the IR image; when no face area is detected in the RGB image and the IR image, returning to the step S1;
step S203: performing expression detection on the face area, determining the expression type of the face area, and triggering step S3 when the expression type of the face area accords with a preset expression type set; the expression detection is performed through an expression detection model based on a neural network.
2. The target region-based depth reconstruction method according to claim 1, wherein the step S3 comprises the steps of:
step S301: mapping the face area of the RGB image into the infrared light spot image, and determining a plurality of light spots of the face area in the infrared light spot image;
step S302: acquiring a plurality of light spot areas of the face area;
step S303: and generating a depth face image of the face region according to the deformation or displacement of the plurality of light spot areas corresponding to the face region.
3. The depth reconstruction method based on the target area according to claim 1, wherein when the target face is subjected to depth reconstruction to generate a depth face image, the following steps are simultaneously performed:
- performing quality detection on the RGB image and the IR image;
- performing living body detection on the IR image when the IR image meets a preset quality standard.
4. A target region-based depth reconstruction method according to claim 3, comprising the steps of, when quality detecting the RGB image, the IR image:
step M1: acquiring a preset image quality standard;
step M2: judging whether the RGB image and the IR image meet the preset image quality standard; triggering step M3 when they do; when they do not, sending out second prompt information and returning to step S1;
step M3: and performing living body detection on the target face according to the IR image.
5. The target region-based depth reconstruction method according to claim 1, wherein the step S203 comprises the steps of:
step S2031: acquiring a preset expression type set;
step S2032: judging whether the expression type is in the preset expression type set; triggering step S2033 when it is; when it is not, sending out first prompt information and returning to step S1;
step S2033: selecting the RGB image whose expression type is in the preset expression type set, and triggering step S3.
6. A target region-based depth reconstruction system, comprising the following modules:
the image acquisition module is used for acquiring an RGB image and an infrared light spot image of the target face;
the face detection module is used for carrying out face detection on the RGB image to determine a face area in the RGB image;
the depth reconstruction module is used for carrying out depth reconstruction on the target face according to the infrared light spot image and the face region in the RGB image to generate a depth face image;
the image acquisition module comprises the following steps when in processing:
step S101: alternately acquiring an IR image and an infrared spot image of the target face through an infrared camera;
step S102: collecting RGB images of the target face through an RGB camera;
step S103: acquiring an RGB image, an IR image and an infrared light spot image of a target face, and previewing the RGB image in real time;
the face detection module comprises the following steps when in processing:
step S201: face detection is carried out on the RGB image to determine a face area in the RGB image;
step S202: triggering step S203 when the face area is detected in the RGB image and the IR image; when no face area is detected in the RGB image and the IR image, returning to the step S1;
step S203: performing expression detection on the face area, determining the expression type of the face area, and triggering step S3 when the expression type of the face area accords with a preset expression type set; the expression detection is performed through an expression detection model based on a neural network.
7. A depth reconstruction device based on a target region, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the target region based depth reconstruction method of any one of claims 1 to 5 via execution of the executable instructions.
8. A computer readable storage medium storing a program, wherein the program when executed implements the steps of the target region based depth reconstruction method of any one of claims 1 to 5.
CN202010411573.9A 2020-05-15 2020-05-15 Depth reconstruction method, system, equipment and medium based on target area Active CN113673286B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010411573.9A CN113673286B (en) 2020-05-15 2020-05-15 Depth reconstruction method, system, equipment and medium based on target area

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010411573.9A CN113673286B (en) 2020-05-15 2020-05-15 Depth reconstruction method, system, equipment and medium based on target area

Publications (2)

Publication Number Publication Date
CN113673286A CN113673286A (en) 2021-11-19
CN113673286B (en) 2024-04-16

Family

ID=78537512

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010411573.9A Active CN113673286B (en) 2020-05-15 2020-05-15 Depth reconstruction method, system, equipment and medium based on target area

Country Status (1)

Country Link
CN (1) CN113673286B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018049952A1 (en) * 2016-09-14 2018-03-22 厦门幻世网络科技有限公司 Photo acquisition method and device
CN107480613A (en) * 2017-07-31 2017-12-15 广东欧珀移动通信有限公司 Face identification method, device, mobile terminal and computer-readable recording medium
CN108764052A (en) * 2018-04-28 2018-11-06 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN108734776A (en) * 2018-05-23 2018-11-02 四川川大智胜软件股份有限公司 A kind of three-dimensional facial reconstruction method and equipment based on speckle
CN109284597A (en) * 2018-11-22 2019-01-29 北京旷视科技有限公司 A kind of face unlocking method, device, electronic equipment and computer-readable medium
CN110287900A (en) * 2019-06-27 2019-09-27 深圳市商汤科技有限公司 Verification method and verifying device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Expression-Invariant Face Recognition via 3D Face Reconstruction Using Gabor Filter Bank from a 2D Single Image; Ali Moeini et al.; 2014 22nd International Conference on Pattern Recognition; Vol. 1, No. 1; entire document *
Survey of three-dimensional facial expression acquisition and reconstruction technology (三维人脸表情获取及重建技术综述); Wang Shan (王珊); Journal of System Simulation (系统仿真学报); Vol. 30, No. 07; entire document *

Also Published As

Publication number Publication date
CN113673286A (en) 2021-11-19

Similar Documents

Publication Publication Date Title
EP3491332B1 (en) Reflectivity map estimate from dot based structured light systems
JP6239594B2 (en) 3D information processing apparatus and method
JP2020042800A (en) Method and apparatus for generating object detection box, device, storage medium, and vehicle
US20130335535A1 (en) Digital 3d camera using periodic illumination
JP2016201117A (en) Improvement in quality of depth measurement
KR101815407B1 (en) Parallax operation system, information processing apparatus, information processing method, and recording medium
EP3135033B1 (en) Structured stereo
CN107592449B (en) Three-dimensional model establishing method and device and mobile terminal
JP2020042772A (en) Depth data processing system capable of optimizing depth data by image positioning with respect to depth map
CN107463659B (en) Object searching method and device
US20160245641A1 (en) Projection transformations for depth estimation
CN108495113B (en) Control method and device for binocular vision system
US20210192243A1 (en) Method, system, and computer-readable medium for generating spoofed structured light illuminated face
Zhang et al. Indepth: Real-time depth inpainting for mobile augmented reality
JP2023530545A (en) Spatial geometric information estimation model generation method and apparatus
CN113344906A (en) Vehicle-road cooperative camera evaluation method and device, road side equipment and cloud control platform
CN113673286B (en) Depth reconstruction method, system, equipment and medium based on target area
CN113673287B (en) Depth reconstruction method, system, equipment and medium based on target time node
CN113673285B (en) Depth reconstruction method, system, equipment and medium during capturing of depth camera
CN112749610A (en) Depth image, reference structured light image generation method and device and electronic equipment
CN112150529B (en) Depth information determination method and device for image feature points
CN113673284B (en) Depth camera snapshot method, system, equipment and medium
WO2022022136A1 (en) Depth image generation method and apparatus, reference image generation method and apparatus, electronic device, and computer readable storage medium
CN114066980A (en) Object detection method and device, electronic equipment and automatic driving vehicle
CN107589834B (en) Terminal device operation method and device and terminal device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant