CN112749610A - Depth image and reference structured light image generation method and apparatus, and electronic device - Google Patents

Info

Publication number: CN112749610A
Application number: CN202010734333.2A
Authority: CN (China)
Prior art keywords: image, structured light, light image, wide, depth
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 洪哲鸣, 王军, 王少鸣, 郭润增
Current assignee: Tencent Technology Shenzhen Co Ltd
Original assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 20/00 Payment architectures, schemes or protocols
    • G06Q 20/38 Payment protocols; Details thereof
    • G06Q 20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q 20/401 Transaction verification
    • G06Q 20/4014 Identity check for transactions
    • G06Q 20/40145 Biometric identity checks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/557 Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10052 Images from lightfield camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Studio Devices (AREA)

Abstract

The application provides a depth image and reference structured light image generation method and apparatus, and an electronic device. It belongs to the technical field of image processing, relates to computer vision, and aims to improve the accuracy of face-recognition payment. According to the embodiments of the application, a laser projects a structured light beam into the shooting environment; after the shooting environment is shot to obtain a structured light image to be processed, the structured light image to be processed is converted into a depth image according to a pre-stored reference structured light image; the reference structured light image is obtained by converting a structured light image shot by an external wide-angle camera through an image mapping matrix between the wide-angle camera and the depth camera. Because the structured light image shot by the wide-angle camera is converted into the reference structured light image, the reference structured light image contains more light spot coding information, and the algorithm black edge caused by the structured light depth algorithm can be reduced when the depth camera converts the collected structured light image to be processed into the depth image.

Description

Depth image and reference structured light image generation method and apparatus, and electronic device
Technical Field
The present disclosure relates to the field of image processing, and in particular, to a method and an apparatus for generating a depth image and a reference structured light image, and an electronic device.
Background
With the development of computer technology, face recognition of users is required in more and more fields. For example, when face recognition technology is applied in the payment field, the electronic payment terminal scans the user's face area, performs face recognition on the user, and completes the payment operation after the face recognition passes.
At present, a common face recognition method is to collect an image containing the user's face, which is generally a two-dimensional image, and to recognize the face region in that two-dimensional image. However, recognizing the face region in a two-dimensional image cannot determine whether the face in the image belongs to a real user; for example, during face-recognition payment, the payment terminal may also accept a face presented in a photograph. Because a depth image represents the distance information between the surface of the photographed object and the camera, performing face recognition on a depth image makes it possible to judge whether the photographed face is a real face, thereby improving the accuracy of face recognition. Therefore, a scheme for generating a depth image is needed.
Disclosure of Invention
The embodiments of the application provide a depth image and reference structured light image generation method and apparatus, and an electronic device, which are used to improve the accuracy of face-recognition-based payment.
In a first aspect, an embodiment of the present application provides a depth image generation method, including:
projecting a structured light beam into a shooting environment through a laser;
after the depth camera shoots the shooting environment to obtain a structured light image to be processed, converting the structured light image to be processed into a depth image according to a pre-stored reference structured light image; the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and the depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment into which the laser projects the structured light beam; the field angle of the wide-angle camera is greater than the field angle of the depth camera.
In a second aspect, an embodiment of the present application provides a reference structured light image generation method, including:
projecting a structured light beam into a shooting environment through a laser;
the depth camera acquires a structured light image obtained by a wide-angle camera shooting a reference plane in the shooting environment;
the depth camera converts the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera; wherein a field angle of the wide-angle camera is greater than a field angle of the depth camera.
In a third aspect, an embodiment of the present application provides a depth image generating apparatus, including:
a first control unit for projecting a structured light beam into a shooting environment by a laser;
the first conversion unit is configured to, after the shooting environment is shot to obtain a structured light image to be processed, convert the structured light image to be processed into a depth image according to a pre-stored reference structured light image; the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and a depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment into which the laser projects the structured light beam; the field angle of the wide-angle camera is greater than the field angle of the depth camera.
Optionally, the first conversion unit is specifically configured to:
for each pixel point in the structured light image to be processed, determine a similar pixel point corresponding to the pixel point from the pre-stored reference structured light image according to a block matching algorithm; determine the parallax between the pixel point and the corresponding similar pixel point according to the coordinates of the pixel point in the structured light image to be processed and the coordinates of the corresponding similar pixel point in the pre-stored reference structured light image; and determine the depth value corresponding to the pixel point according to the parallax and the depth information of the reference structured light image;
and generate a depth image corresponding to the structured light image to be processed according to the depth value corresponding to each pixel point in the structured light image to be processed.
Optionally, the first conversion unit is specifically configured to:
determining, from the structured light image to be processed, a pixel block of a preset size centered on the pixel point;
determining the similarity between the pixel block and a plurality of candidate pixel blocks of the preset size included in the pre-stored reference structured light image;
and selecting a target pixel block from the plurality of candidate pixel blocks included in the pre-stored reference structured light image according to the similarity, and taking the central pixel point of the target pixel block as the similar pixel point corresponding to the pixel point.
In a fourth aspect, an embodiment of the present application provides a reference structured light image generating apparatus, including:
a second control unit for projecting the structured light beam into the shooting environment by the laser;
the acquisition unit is configured to acquire a structured light image obtained by the wide-angle camera shooting a reference plane in the shooting environment;
the second conversion unit is used for converting the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera; wherein a field angle of the wide-angle camera is greater than a field angle of the depth camera.
Optionally, the second conversion unit is configured to determine an image mapping matrix between the wide-angle camera and the depth camera according to the following manner:
shooting a target reference object to obtain a first plane image containing the target reference object, and acquiring a second plane image containing the target reference object, which is obtained by shooting the target reference object by the wide-angle camera;
and determining an image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of the plurality of characteristic points of the target reference object in the first plane image and the pixel coordinates of the plurality of characteristic points of the target reference object in the second plane image.
In a fifth aspect, an embodiment of the present application provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a depth image generation method provided herein or a reference structured light image generation method provided herein.
In a sixth aspect, embodiments of the present application provide a computer-readable medium storing computer-executable instructions for performing the depth image generation method provided by the present application or performing the reference structured light image generation method provided by the present application.
The application has the beneficial effects that:
When the terminal device needs to recognize a face, after the depth camera collects the structured light image to be processed, the structured light image to be processed can be converted into a depth image based on the pre-stored reference structured light image. The terminal device then recognizes the face according to the depth image generated by the depth camera and can determine whether the face is a real face; recognition based on the three-dimensional depth image improves recognition accuracy. In addition, the pre-stored reference structured light image is obtained by converting the structured light image shot by the wide-angle camera. Because the wide-angle camera has a larger field angle, the structured light image it shoots contains more light spot information of the laser beam; after that image is converted into the reference structured light image and stored, the pre-stored reference structured light image contains more light spot coding information, so the algorithm black edge caused by the structured light depth algorithm can be reduced when the depth camera converts the collected structured light image to be processed into a depth image.
Drawings
Fig. 1 is a schematic diagram of an optional application scenario in an embodiment of the present application;
fig. 2 is a schematic diagram of another optional application scenario in the embodiment of the present application;
FIG. 3 is a schematic diagram of a display interface of an electronic payment terminal according to an embodiment of the present application;
fig. 4 is a schematic view of a display interface of an electronic payment terminal in an embodiment of the present application;
fig. 5 is a schematic flow chart of a depth image generation method in an embodiment of the present application;
FIG. 6 is a schematic diagram of a structured light depth map algorithm in an embodiment of the present application;
FIG. 7 is a flowchart of a method for generating a reference structured light image according to an embodiment of the present application;
FIG. 8A is a first planar image of a chessboard captured by a depth camera in an embodiment of the present application;
fig. 8B is a second plane image of the chessboard captured by the wide-angle camera in the embodiment of the present application;
FIG. 9 is a schematic diagram illustrating a method for determining depth values corresponding to pixels in a structured light image to be processed according to an embodiment of the present disclosure;
fig. 10 is a schematic view of a display interface of an electronic payment terminal in an embodiment of the present application;
fig. 11 is a schematic view of a display interface of an electronic payment terminal in an embodiment of the present application;
FIG. 12 is a complete flow chart of a payment control method in an embodiment of the present application;
fig. 13 is a schematic structural diagram of a depth image generating apparatus in an embodiment of the present application;
fig. 14 is a schematic structural diagram of another depth image generation apparatus in the embodiment of the present application;
FIG. 15 is a schematic structural diagram of a reference structured light image generating apparatus according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of an electronic device in an embodiment of the present application;
fig. 17 is a schematic structural diagram of a computing device in an embodiment of the present application.
Detailed Description
In order to make the technical solutions disclosed in the present application better understood, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Some terms appearing herein are explained below:
1. Depth map: in 3D computer graphics and computer vision, a depth map is an image or image channel that represents the distance information between the surface of a photographed object and the image acquisition device. Each pixel point of the depth map represents the perpendicular distance between the image acquisition device and the photographed object, usually stored as a 16-bit value in millimeters.
2. Structured light: a beam projected by a laser or projector into an environment; after the structured light beam is projected onto the surface of an object, it is collected by the image acquisition device, and the position or depth of the object can be determined from the change of the light signal caused by the object.
3. Depth map algorithm black edge: in the structured light depth algorithm, a black edge of a depth map is caused by the depth map algorithm; in real-time depth calculation, when a matching region cannot be retrieved from the reference structured light image, the depth value is assigned to 0, thereby forming a black edge.
4. Binarization processing: image binarization sets the gray value of each pixel point in an image to 0 or 1 by selecting a reasonable threshold; the threshold may be selected based on a global threshold, a local threshold, etc.
5. Server: the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
The following briefly introduces the design concept of the embodiments of the present application:
computer Vision technology (CV) is a science for researching how to make a machine "see", and further refers to that a camera and a Computer are used to replace human eyes to perform machine Vision such as identification, tracking and measurement on a target, and further image processing is performed, so that the Computer processing becomes an image more suitable for human eyes to observe or is transmitted to an instrument to detect. As a scientific discipline, computer vision research-related theories and techniques attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, synchronous positioning, map construction, and other technologies, and also include common biometric technologies such as face recognition and fingerprint recognition.
With the continuous development of computer vision technology, face recognition operation can be performed in more and more scenes. In the related art, when face recognition is performed through a terminal device, the terminal device can acquire a two-dimensional image containing a face of a user, and recognize a face region in the two-dimensional image. However, by recognizing the face in the two-dimensional image, the terminal device cannot judge whether the photographed face is a real face, which may result in low accuracy of face recognition.
In view of this, embodiments of the present application provide a depth image generation method, apparatus, electronic device, and computer storage medium. A structured light beam is projected into the shooting environment through a laser; the depth camera then shoots the shooting environment to obtain a structured light image to be processed and converts it into a depth image according to a pre-stored reference structured light image, where the pre-stored reference structured light image is obtained by converting the structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and the depth camera. Therefore, when the terminal device needs to recognize a face, after the depth camera collects the structured light image to be processed, it can convert that image into a depth image based on the pre-stored reference structured light image; the terminal device recognizes the face according to the depth image generated by the depth camera, can determine whether the face is a real face, and recognition based on the three-dimensional depth image improves accuracy. In addition, because the wide-angle camera has a larger field angle, the structured light image it shoots contains more light spot information of the laser beam; after that image is converted into the reference structured light image and stored, the pre-stored reference structured light image contains more light spot coding information, so the algorithm black edge caused by the structured light depth algorithm can be reduced when the depth camera converts the collected structured light image to be processed into a depth image.
After introducing the design concept of the embodiments of the present application, application scenarios to which the technical solution of the embodiments can be applied are briefly described below. It should be noted that the application scenarios described below are only used to illustrate the embodiments of the present application and do not constitute a limitation; in a specific implementation process, the technical scheme provided by the embodiments of the application can be flexibly applied according to actual needs.
Fig. 1 is a schematic diagram of an exemplary application scenario according to an embodiment of the present application, which is a payment scenario based on face recognition, and includes an electronic payment terminal 10, a server 20, a payment user 30, and a depth camera 40 externally connected to or built in the electronic payment terminal; wherein the depth camera 40 comprises a laser.
A structured light beam is projected into the shooting environment through the laser, and the depth camera 40 shoots the shooting environment to obtain a structured light image to be processed; the depth camera 40 converts the shot structured light image to be processed into a depth image according to a pre-stored reference structured light image. The reference structured light image pre-stored in the depth camera 40 is obtained by converting the structured light image shot by the wide-angle camera through an image mapping matrix between the wide-angle camera and the depth camera 40; the manner of determining the reference structured light image stored in the depth camera 40 is described with reference to the scene shown in fig. 2.
In an optional embodiment, after receiving a payment initiation signal triggered by an upstream terminal initiating a payment operation (for example, a PC device installed with a client for receiving payments), the electronic payment terminal 10 projects a structured light beam into the shooting environment through the laser and prompts the payment user 30 to start face payment. The depth camera 40 shoots the shooting environment; since the payment user 30 is in the shooting environment during face payment, the depth camera 40 can shoot a structured light image to be processed containing the face of the payment user 30.
The depth camera 40 converts the structured light image to be processed into a depth image, and then sends the converted depth image to the electronic payment terminal 10; in an alternative embodiment, the depth camera 40 may further photograph the photographing environment to obtain an RGB image and an infrared image including the payment user 30, and transmit the converted depth image, RGB image and infrared image to the electronic payment terminal 10.
The electronic payment terminal 10 performs face recognition according to the received depth image, RGB image and infrared image; alternatively, the electronic payment terminal 10 sends the received depth image, RGB image and infrared image to the server 20, and the server 20 performs face recognition according to them and returns the recognition result to the electronic payment terminal 10. After the recognition passes, the electronic payment terminal 10 is notified of the passing result and executes the subsequent payment operation; after the server 20 determines that the recognition fails, it notifies the electronic payment terminal 10 of the failure, and the electronic payment terminal 10 prompts the payment user 30 to retry face payment or prompts the payment user 30 that the payment failed.
Fig. 2 is a schematic diagram of an application scenario of generating a reference structured light image according to an embodiment of the present application, which includes a depth camera 40, a wide-angle camera 50, and a reference plane 60; wherein the depth camera 40 comprises a laser.
When determining the reference structured light image stored in the depth camera 40, the laser projects a structured light beam into the shooting environment and the wide-angle camera 50 shoots the reference plane 60 in the shooting environment. After the depth camera 40 acquires the structured light image shot by the wide-angle camera 50, it converts that structured light image into the reference structured light image according to the image mapping matrix between the wide-angle camera 50 and the depth camera 40, and stores the converted reference structured light image in the depth camera 40.
It should be noted that the laser projecting the structured light beam into the shooting environment during the generation of the reference structured light image and the laser projecting the structured light beam into the shooting environment during the generation of the depth image are the same laser in the depth camera.
In the payment process, after the electronic payment terminal determines to start the payment operation, it projects the structured light beam into the shooting environment through the laser and prompts the payment user in its display interface to start face payment; as shown in fig. 3, "start face payment" is displayed in the display interface of the electronic payment terminal.
Then, the depth camera shoots the shooting environment to obtain the structured light image to be processed. After the electronic payment terminal prompts the user to start face payment, the payment user can move into the shooting environment of the depth camera according to the prompt information in the display interface of the electronic payment terminal. In one optional mode, a preview image containing the payment user is displayed in the display interface of the electronic payment terminal, and the payment user adjusts his or her position according to the preview image so that the face region is within the shooting environment of the depth camera; for example, fig. 4 shows a display interface of the electronic payment terminal in which a preview image of the payment user is displayed.
In the following, a depth image generation method provided by an exemplary embodiment of the present application is described with reference to fig. 5 in conjunction with the application scenario described above. It should be noted that the above application scenarios are only presented to facilitate understanding of the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
As shown in fig. 5, for a schematic depth image generation flow provided in an embodiment of the present application, the method may include the following steps:
step S51: projecting a structured light beam into a shooting environment through a laser;
step S52: and after the depth camera shoots the shooting environment to obtain a structural light image to be processed, converting the structural light image to be processed into a depth image according to a pre-stored reference structural light image.
The reference structured light image stored in advance is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and a depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment in which the laser projects a structured light beam; the field angle of the wide-angle camera is greater than the field angle of the depth camera.
The laser projects the structured light beam into the shooting environment, and the depth camera shoots the shooting environment to obtain the structured light image to be processed. After the structured light image to be processed is obtained, the depth camera converts it into a depth image according to the pre-stored reference structured light image.
The embodiment of the application can adopt various modes when generating the reference structured light image which is stored in the depth camera in advance;
alternatively, the depth camera captures a reference plane in the capture environment to obtain a reference structured light image.
Because the field angle of the depth camera is limited, a reference structured light image shot by the depth camera itself contains less light spot coding information of the structured light beam, and more algorithm black edges are generated when the structured light image to be processed is converted into a depth image from such a reference structured light image using the structured light depth map algorithm;
as shown in the schematic diagram of the structured light depth map algorithm in fig. 6, the laser projects a structured light beam into the shooting environment at the RX position in fig. 6, and the depth camera at the TX position in fig. 6 shoots a reference plane in the environment to obtain the reference structured light image. The distance between the reference plane in the shooting environment and the depth camera is a preset value; the range of the structured light beam included in the reference structured light image shot by the depth camera is shown as line segment A in fig. 6.
In the depth image generation process of the electronic payment terminal, suppose the payment user performs face payment at the long distance shown at position B. The range of the structured light beam contained in the structured light image to be processed, which contains the payment user's face region and is collected by the depth camera, is then larger than the range of the structured light beam contained in the reference structured light image. When the structured light image to be processed is converted into a depth image according to the reference structured light image, no match can be determined in the reference structured light image for some pixel points of the structured light image to be processed; the pixel values of those pixel points are assigned 0, so the converted depth image contains a black edge. Likewise, if the payment user performs face payment at the short distance shown at position C, the structured light beam corresponding to part of the region of the structured light image to be processed is not within the structured light beam contained in the reference structured light image; again some pixel points cannot be matched in the reference structured light image and are assigned 0, so the converted depth image also contains a black edge.
An alternative way of generating the reference structured light image is also provided in the embodiments of the present application, such as the schematic flow chart of generating the reference structured light image shown in fig. 7, where the method may include the following steps:
step S71: projecting a structured light beam into a shooting environment through a laser;
step S72: the method comprises the steps that a depth camera obtains a wide-angle camera to shoot a reference plane in a shooting environment to obtain a structured light image;
step S73: the depth camera converts a structured light image captured by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera.
The laser that projects the structured light beam during the generation of the reference structured light image is the same laser that projects the structured light beam during the generation of the depth image.
Wherein the field angle of the wide-angle camera is greater than the field angle of the depth camera;
for example, the resolution of the structured light image captured by the depth camera may be 1920 × 800; the structured light image captured by the wide-angle camera is a large-size structured light image, and the resolution of the captured structured light image may be 4000 × 3000.
The process of generating the reference structured light image may be referred to as a calibration process for the depth camera.
In the process of generating the reference structured light image, the laser projects the structured light beam into the shooting environment, so a structured light image can be obtained by shooting a reference plane in the shooting environment with the wide-angle camera; the distance between the reference plane and the wide-angle camera is preset, and because the wide-angle camera shoots a plane, the depth information of each pixel point of the obtained structured light image is the same.
In the process of generating the depth image, the depth camera shoots the structured light image to be processed, and the structured light image to be processed is converted into a depth image according to the reference structured light image; the reference structured light image therefore needs to match the depth camera.
Accordingly, after the structured light image shot by the wide-angle camera is acquired, it needs to be converted into a reference structured light image matched with the depth camera according to the image mapping matrix between the wide-angle camera and the depth camera;
it should be noted that the image mapping matrix between the wide-angle camera and the depth camera essentially arises because the wide-angle camera and the depth camera are at different positions when capturing images.
In an optional implementation manner, the image mapping matrix between the wide-angle camera and the depth camera may be determined according to the following manner:
the method comprises the steps that a depth camera shoots a target reference object to obtain a first plane image containing the target reference object, and a second plane image containing the target reference object and obtained by shooting the target reference object by a wide-angle camera is obtained; the depth camera determines an image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of the plurality of feature points of the target reference object in the first plane image and the pixel coordinates in the second plane image.
One optional target reference object is a chessboard; fig. 8A shows a first planar image of the chessboard shot by the depth camera, and fig. 8B shows a second planar image of the chessboard shot by the wide-angle camera.
Suppose the depth camera shoots the target reference object to obtain planar image a, and the wide-angle camera shoots the target reference object to obtain planar image b; a plurality of feature points are selected on the target reference object (for example, when the target reference object is a chessboard, the feature points may be the corner points of the chessboard squares in the planar image).
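As an illustrative sketch only (the patent does not prescribe a corner detector), such matched feature points could be extracted with OpenCV's findChessboardCorners; the pattern size and the name chessboard_points are assumptions:

import cv2

def chessboard_points(img, pattern=(9, 6)):
    # pattern = (inner corners per row, inner corners per column); an assumption,
    # since the patent does not specify the chessboard dimensions
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if not found:
        raise ValueError("chessboard corners not found")
    return corners.reshape(-1, 2)  # (n, 2) array of pixel coordinates

For example, pts_depth = chessboard_points(image_a) and pts_wide = chessboard_points(image_b) would yield the two coordinate sets used in the fit below, with corresponding rows referring to the same physical corner.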
A quadratic fit is constructed from the pixel coordinates of each feature point in planar image a and planar image b:

x_r = x_0 + dx
y_r = y_0 + dy

where (x_r, y_r) are the pixel coordinates of a pixel point in planar image a shot by the depth camera, (x_0, y_0) are the pixel coordinates of the corresponding pixel point in planar image b shot by the wide-angle camera, and dx, dy are the offsets between the pixel coordinates of the same feature point in the two planar images;

the coordinate offsets dx, dy are defined as quadratic functions of (x_0, y_0):

dx = a_x·x_0² + b_x·y_0² + c_x·x_0·y_0 + d_x·x_0 + e_x·y_0 + f_x
dy = a_y·x_0² + b_y·y_0² + c_y·x_0·y_0 + d_y·x_0 + e_y·y_0 + f_y

Substituting the pixel coordinates of the plurality of feature points in the two planar images into the quadratic fit yields the unknown parameters a_x, b_x, c_x, d_x, e_x, f_x, a_y, b_y, c_y, d_y, e_y, f_y.
An image mapping matrix between the wide-angle camera and the depth camera is then determined from the solved parameters;

the image mapping matrix between the wide-angle camera and the depth camera comprises a mapping matrix corresponding to the abscissa and a mapping matrix corresponding to the ordinate;

for the n feature points, the abscissa mapping can be written as

x_ri = x_0i + a_x·x_0i² + b_x·y_0i² + c_x·x_0i·y_0i + d_x·x_0i + e_x·y_0i + f_x, i = 1, …, n

and the ordinate mapping as

y_ri = y_0i + a_y·x_0i² + b_y·y_0i² + c_y·x_0i·y_0i + d_y·x_0i + e_y·y_0i + f_y, i = 1, …, n

where x_r1, x_r2, …, x_rn are the abscissas of pixel points in planar image a shot by the depth camera, y_r1, y_r2, …, y_rn are the ordinates of pixel points in planar image a shot by the depth camera, x_01, x_02, …, x_0n are the abscissas of pixel points in planar image b shot by the wide-angle camera, and y_01, y_02, …, y_0n are the ordinates of pixel points in planar image b shot by the wide-angle camera.
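For illustration, the parameter solving described above amounts to a linear least-squares fit over the feature points; this is a minimal sketch under the reconstruction above, not the patent's implementation, and the name fit_offset_coeffs is illustrative:

import numpy as np

def fit_offset_coeffs(pts_depth, pts_wide):
    # pts_depth: (n, 2) float coordinates (x_r, y_r) in planar image a (depth camera)
    # pts_wide:  (n, 2) float coordinates (x_0, y_0) of the same points in planar image b
    x0, y0 = pts_wide[:, 0], pts_wide[:, 1]
    # design matrix of the quadratic terms [x0^2, y0^2, x0*y0, x0, y0, 1]
    A = np.stack([x0**2, y0**2, x0 * y0, x0, y0, np.ones_like(x0)], axis=1)
    coeffs_x, *_ = np.linalg.lstsq(A, pts_depth[:, 0] - x0, rcond=None)  # dx targets
    coeffs_y, *_ = np.linalg.lstsq(A, pts_depth[:, 1] - y0, rcond=None)  # dy targets
    return coeffs_x, coeffs_y  # each is [a, b, c, d, e, f]

Using more than six feature points and solving in the least-squares sense makes the fit more robust to corner-detection noise.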
According to the embodiments of the application, when the structured light image shot by the wide-angle camera is converted into the reference structured light image matched with the depth camera, each pixel point in the structured light image shot by the wide-angle camera is mapped through the image mapping matrix between the wide-angle camera and the depth camera to obtain the reference structured light image matched with the depth camera;
in implementation, for the pixel coordinates of each pixel point in the structured light image shot by the wide-angle camera, the pixel coordinates of the pixel point mapped into the reference structured light image are obtained through the following formulas:

x_mi = x_li + a_x·x_li² + b_x·y_li² + c_x·x_li·y_li + d_x·x_li + e_x·y_li + f_x
y_mi = y_li + a_y·x_li² + b_y·y_li² + c_y·x_li·y_li + d_y·x_li + e_y·y_li + f_y

where x_m1, x_m2, …, x_mn are the abscissas of pixel points in the reference structured light image, y_m1, y_m2, …, y_mn are the ordinates of pixel points in the reference structured light image, x_l1, x_l2, …, x_ln are the abscissas of pixel points in the structured light image shot by the wide-angle camera, and y_l1, y_l2, …, y_ln are the ordinates of pixel points in the structured light image shot by the wide-angle camera.
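Applying the fitted offsets to map one wide-angle pixel coordinate into the reference structured light image can be sketched as follows (a minimal sketch under the reconstruction above; the function name is illustrative):

import numpy as np

def map_to_reference(xl, yl, coeffs_x, coeffs_y):
    # map a pixel coordinate of the wide-angle structured light image into the
    # reference structured light image using the fitted quadratic offsets
    q = np.array([xl**2, yl**2, xl * yl, xl, yl, 1.0])
    return xl + q @ coeffs_x, yl + q @ coeffs_y

Since the mapped coordinates are generally fractional, actually building the reference structured light image would additionally require rounding or interpolation, which the patent text does not detail.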
In the payment process, when the depth camera shoots the structured light image to be processed, the structured light image to be processed is converted into a depth image according to the pre-stored reference structured light image; the process of converting the structured light image to be processed into a depth image is described in detail below.
First, the depth value corresponding to each pixel point in the structured light image to be processed is determined according to the pre-stored reference structured light image; then a depth image corresponding to the structured light image to be processed is generated according to the depth value corresponding to each pixel point;
when the structured light image to be processed is converted into a depth image according to the pre-stored reference structured light image, both the pre-stored reference structured light image and the structured light image to be processed are first binarized;
the binarization may use a global threshold algorithm, a local threshold algorithm, a dynamic threshold algorithm, the Niblack algorithm, the P-quantile algorithm, an iterative algorithm, an entropy-based algorithm, the maximum between-class variance (Otsu) algorithm, etc., setting the gray level of each pixel in the pre-stored reference structured light image and in the structured light image to be processed to 0 or 1.
For example, taking the global threshold algorithm and assuming the resolution of the structured light image is 1920 × 800: the average brightness of all pixels in the structured light image is determined, and during binarization, pixels brighter than the average are marked 1 and pixels darker than the average are marked 0.
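A minimal sketch of this global-threshold binarization (the function name is illustrative):

import numpy as np

def binarize_global(img):
    # global threshold: pixels brighter than the mean brightness become 1, others 0
    return (img > img.mean()).astype(np.uint8)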
Then, for each pixel point in the binarized structured light image to be processed, a corresponding similar pixel point is determined in the binarized reference structured light image according to a block matching algorithm;
in an optional implementation, for each pixel point in the binarized structured light image to be processed, a pixel block of a preset size centered on that pixel point is determined from the structured light image to be processed; the similarity between this pixel block and a plurality of candidate pixel blocks of the preset size included in the binarized reference structured light image is determined; a target pixel block is selected from the candidate pixel blocks according to the similarity, and the central pixel point of the target pixel block is taken as the similar pixel point corresponding to the pixel point;
for example, assuming the preset size is a 3 × 3 pixel block, all 3 × 3 candidate pixel blocks in the binarized reference structured light image can be determined by sliding a window over it; the similarity between each pixel block in the binarized structured light image to be processed and the candidate pixel blocks is calculated, the candidate pixel block with the highest similarity is selected as the target pixel block, and the central pixel point of the target pixel block is the similar pixel point corresponding to the pixel point in the structured light image to be processed.
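A sketch of this block matching step under stated assumptions: binary images, a 3 × 3 block, and the count of agreeing pixels as the similarity measure (the patent does not fix a specific metric). The exhaustive window sliding shown here is slow; a real implementation would typically restrict the search region:

import numpy as np

def best_match(block_img, ref_bin):
    # block_img: binarized pixel block (e.g. 3x3) from the image to be processed
    # ref_bin:   binarized reference structured light image
    bh, bw = block_img.shape
    h, w = ref_bin.shape
    best_sim, best_center = -1, None
    for ty in range(h - bh + 1):          # slide a window over the reference image
        for tx in range(w - bw + 1):
            cand = ref_bin[ty:ty + bh, tx:tx + bw]
            sim = int((cand == block_img).sum())  # number of agreeing binary pixels
            if sim > best_sim:
                best_sim = sim
                best_center = (tx + bw // 2, ty + bh // 2)  # center of the target block
    return best_center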
After the similar pixel point corresponding to a pixel point of the binarized structured light image to be processed has been determined in the binarized reference structured light image, the parallax between the pixel point and the corresponding similar pixel point is determined;
the parallax between the pixel point and the corresponding similar pixel point is the coordinate offset between the coordinates of the pixel point in the structured light image to be processed and the coordinates of the similar pixel point in the reference structured light image.
After the parallax between the pixel point and the corresponding similar pixel point is determined, the depth value corresponding to the pixel point in the structured light image to be processed is determined according to the parallax and the depth information of the reference structured light image;
An optional manner of determining the depth value corresponding to a pixel point in the structured light image to be processed is shown in fig. 9. Take pixel point a in the structured light image to be processed as an example, and let its corresponding similar pixel point in the reference structured light image be pixel point b; pixel point a and pixel point b then correspond to the same structured light beam in the shooting environment. Assume that pixel point a corresponds to position A in the shooting environment shown in fig. 9, and pixel point b corresponds to position B in the shooting environment shown in fig. 9;
position A corresponds to A' shown in fig. 9 when imaged by the depth camera, and position B corresponds to B' shown in fig. 9; the distance d1 between A' and B' is then the determined parallax between pixel point a and pixel point b;
the baseline distance d2 between TX and RX is the distance between the position where the laser projects the structured light beam and the position where the depth camera acquires the image;
the distance R between the position B and a straight line where TX and RX are located is depth information of the reference structured light image, and may be a preset distance between the wide-angle camera and a reference plane when the wide-angle camera shoots the structured light image;
then, the known quantities in fig. 9 are: the distance R between position B and the straight line through TX and RX, the distance d1 between A' and B', the baseline distance d2 between TX and RX, and the focal length f of the depth camera;
then according to the triangle similarity principle:
d3/d1=R/f;
d3/d2=(H-R)/H;
according to the above formula, H can be solved as the distance between the depth cameras at position a, i.e. the depth value corresponding to the pixel point a.
After the depth value corresponding to each pixel point in the binarized structured light image to be processed is determined in this way, the depth image corresponding to the structured light image to be processed is generated.
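Putting the pieces together, an end-to-end sketch of the conversion, reusing binarize_global, best_match and depth_from_disparity from the sketches above (all names illustrative; exhaustive matching is far too slow for real-time use, and a real implementation would restrict the search region and reject weak matches):

import numpy as np
# assumes binarize_global, best_match and depth_from_disparity defined as above

def to_depth_image(to_proc, ref, d2, R, f, block=3):
    # to_proc: structured light image to be processed; ref: reference structured light image
    tp, rb = binarize_global(to_proc), binarize_global(ref)
    depth = np.zeros(tp.shape, dtype=np.float32)
    r = block // 2
    for y in range(r, tp.shape[0] - r):
        for x in range(r, tp.shape[1] - r):
            blk = tp[y - r:y + r + 1, x - r:x + r + 1]
            cx, cy = best_match(blk, rb)
            d1 = x - cx                      # parallax taken along the abscissa (an assumption)
            if d2 * f > d1 * R:              # keep the triangulation denominator positive
                depth[y, x] = depth_from_disparity(d1, d2, R, f)
            # pixels left at 0 exhibit the black-edge behavior described above
    return depth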
In a payment scene based on face recognition, after a depth camera generates a depth image, the depth image, an acquired RGB image and an infrared image are sent to an electronic payment terminal; the electronic payment terminal executes corresponding payment operation based on the face recognition result;
in implementation, an optional implementation manner is that the electronic payment terminal may perform face recognition according to the depth image, the RGB image, and the infrared image, and determine a recognition result of the face recognition;
or, in another optional implementation, the electronic payment terminal may send the depth image, the RGB image and the infrared image to a remote server, which performs face recognition and notifies the electronic payment terminal when the recognition passes, whereupon the electronic payment terminal performs the subsequent payment operation; if the remote server's face recognition on the depth image fails, the electronic payment terminal is informed of the failure and determines that the payment has failed.
After the electronic payment terminal determines that the face recognition is passed and completes the corresponding payment operation, prompt information of 'payment success' shown in fig. 10 can be displayed in a display interface; after determining that the face recognition is not passed, the electronic payment terminal may display a prompt message of "payment failure" in the display interface as shown in fig. 11.
As shown in fig. 12, taking a payment scenario based on face recognition as an example, a flowchart of a payment control method according to an embodiment of the present application is shown, which includes the following steps:
step S121, projecting a structured light beam to a shooting environment through a laser;
s122, shooting the shooting environment by the depth camera to obtain a structural light image to be processed;
step S123, the depth camera acquires a pre-stored reference structured light image;
the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and a depth camera;
step S124, the depth camera converts the structured light image to be processed into a depth image according to a pre-stored reference structured light image;
step S125, the depth camera sends the depth image, the RGB image and the infrared image to the electronic payment terminal;
step S126, the electronic payment terminal sends the depth image, the RGB image and the infrared image to a server;
s127, the server performs face recognition according to the depth image, the RGB image and the infrared image;
step S128, the server sends the face identification result to the electronic payment terminal;
and S129, the electronic payment terminal executes the corresponding payment operation according to the received face recognition result.
Based on the same inventive concept, the embodiment of the present application further provides a depth image generation apparatus, and as the principle of the apparatus for solving the problem is similar to the depth image generation method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 13, a schematic structural diagram of a depth image generating apparatus 1300 provided in an embodiment of the present application includes:
a first control unit 1301 for projecting a structured light beam into a shooting environment by a laser;
the first conversion unit 1302 is configured to, after the shooting environment is shot to obtain a structured light image to be processed, convert the structured light image to be processed into a depth image according to a pre-stored reference structured light image; the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and a depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment into which the laser projects the structured light beam; the field angle of the wide-angle camera is greater than the field angle of the depth camera.
Optionally, as shown in fig. 14, the apparatus 1300 further includes:
a second control unit 1303 for projecting the structured light beam into the shooting environment by a laser;
a first obtaining unit 1304, configured to obtain a structured light image obtained by shooting a reference plane in a shooting environment by a wide-angle camera;
a second conversion unit 1305, configured to convert the structured-light image captured by the wide-angle camera into a reference structured-light image matched with the depth camera according to the image mapping matrix between the wide-angle camera and the depth camera.
Optionally, the second conversion unit 1305 is specifically configured to determine an image mapping matrix between the wide-angle camera and the depth camera according to the following manner:
shooting a target reference object to obtain a first plane image containing the target reference object, and acquiring a second plane image containing the target reference object, which is obtained by shooting the target reference object by a wide-angle camera;
and determining an image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of the plurality of characteristic points of the target reference object in the first plane image and the pixel coordinates of the plurality of characteristic points of the target reference object in the second plane image.
Optionally, the first conversion unit 1302 is specifically configured to:
for each pixel point in the structured light image to be processed, determining a similar pixel point corresponding to the pixel point from a preset reference structured light image according to a block matching algorithm; determining the parallax between the pixel point and the corresponding similar pixel point according to the coordinates of the pixel point in the structured light image to be processed and the coordinates of the corresponding similar pixel point in the preset reference structured light image; and determining a depth value corresponding to the pixel point according to the parallax and the depth information of the reference structured light image;
and generating a depth image corresponding to the structured light image to be processed according to the depth value corresponding to each pixel point in the structured light image to be processed.
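The embodiments do not fix a particular triangulation formula. Under the common reference-plane model for structured light (where the reference structured light image corresponds to a plane at a known distance, i.e., its depth information), the per-pixel computation might look like the following sketch, in which f_px (focal length in pixels), baseline_m (laser-to-camera baseline in meters), z_ref_m (reference plane distance in meters), the sign convention, and the function name are all assumptions:

def depth_from_disparity(disparity_px: float, f_px: float,
                         baseline_m: float, z_ref_m: float) -> float:
    # Reference-plane model: d = f * b * (1/z_ref - 1/z), hence
    # z = 1 / (1/z_ref - d / (f * b)).
    # Which way the pattern shifts for nearer objects depends on the
    # sensor layout; the sign of disparity_px is assumed accordingly.
    return 1.0 / (1.0 / z_ref_m - disparity_px / (f_px * baseline_m))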
Optionally, the first conversion unit 1302 is specifically configured to:
determining, from the structured light image to be processed, a pixel block of a preset size centered on the pixel point;
determining the similarity between the pixel block and a plurality of candidate pixel blocks of the preset size included in the preset reference structured light image;
and selecting a target pixel block from the plurality of candidate pixel blocks according to the similarity, and taking the central pixel point of the target pixel block as the similar pixel point corresponding to the pixel point.
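A minimal sketch of such a block matching search, using sum of squared differences (SSD) as the similarity measure and scanning candidate blocks along the same image row; the block size, search range and SSD criterion are illustrative assumptions, not requirements of the embodiments:

import numpy as np

def find_similar_pixel(img: np.ndarray, ref: np.ndarray,
                       y: int, x: int, half: int = 3,
                       search: int = 64) -> int:
    # Pixel block of preset size (2*half+1 square) centered on (y, x);
    # the caller is assumed to keep (y, x) at least `half` pixels away
    # from the image borders.
    block = img[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
    best_x, best_cost = x, float("inf")
    # Candidate blocks of the same preset size in the reference image,
    # shifted along the same row within the search range.
    for cx in range(max(half, x - search),
                    min(ref.shape[1] - half, x + search + 1)):
        cand = ref[y - half:y + half + 1,
                   cx - half:cx + half + 1].astype(np.float32)
        cost = float(np.sum((block - cand) ** 2))  # SSD: lower = more similar
        if cost < best_cost:
            best_cost, best_x = cost, cx
    # The center pixel of the best-scoring block is the similar pixel point;
    # the parallax then follows from x and best_x, per the chosen convention.
    return best_x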
Based on the same inventive concept, an embodiment of the present application further provides a reference structured light image generation apparatus. Since the principle by which the apparatus solves the problem is similar to that of the reference structured light image generation method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
As shown in fig. 15, which is a schematic structural diagram of a reference structured light image generation apparatus 1500 provided in an embodiment of the present application, the apparatus includes:
a third control unit 1501 for projecting a structured light beam into the shooting environment by a laser;
a second obtaining unit 1502, configured to acquire a structured light image obtained by the wide-angle camera shooting a reference plane in the shooting environment;
a third conversion unit 1503, configured to convert the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to the image mapping matrix between the wide-angle camera and the depth camera; wherein the field of view of the wide-angle camera is greater than that of the depth camera.
Optionally, the third conversion unit 1503 is specifically configured to determine the image mapping matrix between the wide-angle camera and the depth camera in the following manner:
shooting a target reference object with the depth camera to obtain a first plane image containing the target reference object, and acquiring a second plane image containing the target reference object captured by the wide-angle camera;
and determining the image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of a plurality of feature points of the target reference object in the first plane image and the pixel coordinates of the plurality of feature points in the second plane image.
For convenience of description, the above apparatus is described with its functions divided into modules (or units). Of course, when implementing the present application, the functions of the various modules (or units) may be implemented in one and the same piece, or in multiple pieces, of software or hardware.
As will be appreciated by one skilled in the art, each aspect of the present application may be embodied as a system, method or program product. Accordingly, each aspect of the present application may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
In some possible implementations, embodiments of the present application further provide an electronic device. Referring to fig. 16, the electronic device 1600 may include at least one processor 1601 and at least one memory 1602. The memory 1602 stores program code which, when executed by the processor 1601, causes the processor 1601 to perform the steps in the depth image generation method according to the various exemplary embodiments of the present application described above in this specification (for example, the processor 1601 may perform the steps shown in fig. 5), or causes the processor 1601 to perform the steps in the reference structured light image generation method according to the various exemplary embodiments of the present application described above in this specification (for example, the processor 1601 may perform the steps shown in fig. 7).
In some possible implementations, the present application further provides a computing device, which may include at least one processing unit and at least one storage unit. The storage unit stores program code which, when executed by the processing unit, causes the processing unit to perform the steps in the depth image generation method according to the various exemplary embodiments of the present application described above in this specification (for example, the processing unit may perform the steps shown in fig. 5), or causes the processing unit to perform the steps in the reference structured light image generation method according to the various exemplary embodiments of the present application described above in this specification (for example, the processing unit may perform the steps shown in fig. 7).
A computing device 1700 according to this embodiment of the present application is described below with reference to fig. 17. The computing device 1700 of fig. 17 is only one example and should not impose any limitation on the scope of use or functionality of embodiments of the present application.
As shown in fig. 17, the computing device 1700 is embodied in the form of a general purpose computing device. Components of the computing device 1700 may include, but are not limited to: the at least one processing unit 1701, the at least one storage unit 1702, and the bus 1703 connecting the various system components (including the storage unit 1702 and the processing unit 1701).
Bus 1703 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The storage unit 1702 may include a readable medium in the form of volatile memory, such as a random access memory (RAM) 1721 or a cache memory unit 1722, and may further include a read-only memory (ROM) 1723.
The storage unit 1702 may also include a program/utility 1725 having a set (at least one) of program modules 1724, such program modules 1724 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The computing device 1700 may also communicate with one or more external devices 1704 (e.g., a keyboard, a pointing device, etc.), with one or more devices that enable a user to interact with the computing device 1700, and/or with any device (e.g., a router, a modem, etc.) that enables the computing device 1700 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 1705. Moreover, the computing device 1700 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 1706. As shown, the network adapter 1706 communicates with the other modules of the computing device 1700 via the bus 1703. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the computing device 1700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
In some possible embodiments, each aspect of the depth image generation method provided by the present application may also be implemented in the form of a program product including program code for causing a computer device to perform the steps in the depth image generation method according to various exemplary embodiments of the present application described above in this specification when the program product is run on the computer device, for example, the computer device may perform the steps as shown in fig. 5.
In some possible embodiments, each aspect of the reference structured light image generation method provided by the present application may also be implemented in the form of a program product, which includes program code for causing a computer device to perform the steps in the reference structured light image generation method according to various exemplary embodiments of the present application described above in this specification when the program product is run on a computer device, for example, the computer device may perform the steps as shown in fig. 7.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (15)

1. A depth image generation method, characterized by comprising:
projecting a structured light beam into a shooting environment through a laser;
after the depth camera shoots the shooting environment to obtain a structured light image to be processed, converting the structured light image to be processed into a depth image according to a pre-stored reference structured light image; the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and the depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment in which the laser projects a structured light beam; and the field of view of the wide-angle camera is greater than that of the depth camera.
2. The method of claim 1, wherein the reference structured light image is generated according to:
projecting a structured light beam into a shooting environment by the laser;
the depth camera acquires a structured light image obtained by shooting a reference plane in the shooting environment by the wide-angle camera;
and the depth camera converts the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera.
3. The method of claim 2, wherein the image mapping matrix between the wide-angle camera and the depth camera is determined according to:
the depth camera shoots a target reference object to obtain a first plane image containing the target reference object, and a second plane image containing the target reference object and obtained by shooting the target reference object by the wide-angle camera is obtained;
the depth camera determines an image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of the plurality of feature points of the target reference object in the first plane image and the pixel coordinates of the plurality of feature points of the target reference object in the second plane image.
4. The method of claim 1, wherein the converting, by the depth camera, the structured light image to be processed into a depth image according to the pre-stored reference structured light image comprises:
for each pixel point in the structured light image to be processed, the depth camera determines a similar pixel point corresponding to the pixel point from the preset reference structured light image according to a block matching algorithm; determines the parallax between the pixel point and the corresponding similar pixel point according to the coordinates of the pixel point in the structured light image to be processed and the coordinates of the corresponding similar pixel point in the preset reference structured light image; and determines a depth value corresponding to the pixel point according to the parallax and the depth information of the reference structured light image;
and the depth camera generates a depth image corresponding to the structured light image to be processed according to the depth value corresponding to each pixel point in the structured light image to be processed.
5. The method of claim 4, wherein the depth camera determines similar pixel points corresponding to the pixel points from the preset reference structured light image according to a block matching algorithm, comprising:
the depth camera determines, from the structured light image to be processed, a pixel block of a preset size centered on the pixel point;
the depth camera determines a similarity between the pixel block and a plurality of candidate pixel blocks of the preset size included in the preset reference structured light image;
and the depth camera selects a target pixel block from a plurality of candidate pixel blocks included in the preset reference structured light image according to the similarity, and takes a central pixel point of the target pixel block as a similar pixel point corresponding to the pixel point.
6. A reference structured light image generation method, comprising:
projecting a structured light beam into a shooting environment through a laser;
the depth camera acquires a structured light image obtained by the wide-angle camera shooting a reference plane in the shooting environment;
and the depth camera converts the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera.
7. The method of claim 6, wherein the image mapping matrix between the wide-angle camera and the depth camera is determined according to:
the depth camera shoots a target reference object to obtain a first plane image containing the target reference object, and a second plane image containing the target reference object and obtained by shooting the target reference object by the wide-angle camera is obtained;
the depth camera determines an image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of the plurality of feature points of the target reference object in the first plane image and the pixel coordinates of the plurality of feature points of the target reference object in the second plane image.
8. A depth image generation apparatus, characterized by comprising:
a first control unit for projecting a structured light beam into a shooting environment by a laser;
the first conversion unit is used for converting the structured light image to be processed into a depth image according to a pre-stored reference structured light image after the shooting environment is shot to obtain the structured light image to be processed; the pre-stored reference structured light image is obtained by converting a structured light image shot by a wide-angle camera through an image mapping matrix between the wide-angle camera and a depth camera; the structured light image shot by the wide-angle camera is obtained by shooting a reference plane in a shooting environment in which the laser projects a structured light beam; and the field of view of the wide-angle camera is greater than that of the depth camera.
9. The apparatus of claim 8, wherein the apparatus further comprises:
a second control unit for projecting a structured light beam into a shooting environment by the laser;
the first acquisition unit is used for acquiring a structured light image obtained by the wide-angle camera shooting a reference plane in the shooting environment;
and the second conversion unit is used for converting the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera.
10. The apparatus of claim 9, wherein the second conversion unit is specifically configured to determine the image mapping matrix between the wide-angle camera and the depth camera according to:
shooting a target reference object to obtain a first plane image containing the target reference object, and acquiring a second plane image containing the target reference object, which is obtained by shooting the target reference object by the wide-angle camera;
and determining the image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of a plurality of feature points of the target reference object in the first plane image and the pixel coordinates of the plurality of feature points in the second plane image.
11. A reference structured light image generating apparatus, comprising:
a third control unit for projecting the structured light beam into the shooting environment by a laser;
the second acquisition unit is used for acquiring a structured light image obtained by the wide-angle camera shooting a reference plane in the shooting environment;
a third conversion unit, configured to convert the structured light image shot by the wide-angle camera into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera; wherein the field of view of the wide-angle camera is greater than that of the depth camera.
12. The apparatus of claim 11, wherein the third conversion unit is specifically configured to determine the image mapping matrix between the wide-angle camera and the depth camera according to:
shooting a target reference object to obtain a first plane image containing the target reference object, and acquiring a second plane image containing the target reference object, which is obtained by shooting the target reference object by the wide-angle camera;
and determining the image mapping matrix between the wide-angle camera and the depth camera according to the pixel coordinates of a plurality of feature points of the target reference object in the first plane image and the pixel coordinates of the plurality of feature points in the second plane image.
13. A reference structured light image generation system, comprising a wide-angle camera and a depth camera, the depth camera comprising a laser;
the depth camera is used for projecting a structured light beam into a shooting environment through the laser, acquiring the structured light image shot by the wide-angle camera, and converting the structured light image into a reference structured light image matched with the depth camera according to an image mapping matrix between the wide-angle camera and the depth camera;
the wide-angle camera is used for shooting a reference plane in the shooting environment to obtain a structured light image;
wherein the field of view of the wide-angle camera is greater than that of the depth camera.
14. An electronic device, comprising a processor and a memory, wherein the memory stores program code which, when executed by the processor, causes the processor to carry out the steps of the method of any of claims 1 to 5 or causes the processor to carry out the steps of the method of claim 6 or 7.
15. A computer readable storage medium, characterized in that it comprises program code which, when run on an electronic device, causes the electronic device to carry out the steps of the method of any one of claims 1 to 5, or to carry out the steps of the method of claim 6 or 7.
CN202010734333.2A 2020-07-27 2020-07-27 Depth image, reference structured light image generation method and device and electronic equipment Pending CN112749610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010734333.2A CN112749610A (en) 2020-07-27 2020-07-27 Depth image, reference structured light image generation method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112749610A true CN112749610A (en) 2021-05-04

Family

ID=75645359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010734333.2A Pending CN112749610A (en) 2020-07-27 2020-07-27 Depth image, reference structured light image generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112749610A (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101794068A (en) * 2010-02-04 2010-08-04 朱仕康 Stereo video shooting device
CN102831603A (en) * 2012-07-27 2012-12-19 清华大学 Method and device for carrying out image rendering based on inverse mapping of depth maps
CN105120257A (en) * 2015-08-18 2015-12-02 宁波盈芯信息科技有限公司 Vertical depth sensing device based on structured light coding
CN105141939A (en) * 2015-08-18 2015-12-09 宁波盈芯信息科技有限公司 Three-dimensional depth perception method and three-dimensional depth perception device based on adjustable working range
CN106454090A (en) * 2016-10-09 2017-02-22 深圳奥比中光科技有限公司 Automatic focusing method and system based on depth camera
CN106454287A (en) * 2016-10-27 2017-02-22 深圳奥比中光科技有限公司 Combined camera shooting system, mobile terminal and image processing method
CN106791774A (en) * 2017-01-17 2017-05-31 湖南优象科技有限公司 Virtual visual point image generating method based on depth map
CN108415875A (en) * 2018-02-01 2018-08-17 深圳奥比中光科技有限公司 The method of Depth Imaging mobile terminal and face recognition application
CN108683902A (en) * 2018-03-31 2018-10-19 深圳奥比中光科技有限公司 Target image obtains System and method for
CN108881717A (en) * 2018-06-15 2018-11-23 深圳奥比中光科技有限公司 A kind of Depth Imaging method and system
CN108924407A (en) * 2018-06-15 2018-11-30 深圳奥比中光科技有限公司 A kind of Depth Imaging method and system
CN108924408A (en) * 2018-06-15 2018-11-30 深圳奥比中光科技有限公司 A kind of Depth Imaging method and system
CN110770794A (en) * 2018-08-22 2020-02-07 深圳市大疆创新科技有限公司 Image depth estimation method and device, readable storage medium and electronic equipment
CN109831660A (en) * 2019-02-18 2019-05-31 Oppo广东移动通信有限公司 Depth image acquisition method, depth image obtaining module and electronic equipment
CN110196023A (en) * 2019-04-08 2019-09-03 深圳奥比中光科技有限公司 A kind of double Zoom structure optical depth cameras and Zooming method
CN110517304A (en) * 2019-07-26 2019-11-29 苏州浪潮智能科技有限公司 Generate method, apparatus, electronic equipment and the storage medium of depth map
CN111415388A (en) * 2020-03-17 2020-07-14 Oppo广东移动通信有限公司 Visual positioning method and terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUI CHEN et al.: "A comparative analysis between active structured light and multi-view stereo vision technique for 3D reconstruction of face model surface", Optik, vol. 206, pages 1-9 *
HU Ningning: "Content-aware view synthesis" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, vol. 2018, no. 6, pages 138-1338 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114189623A (en) * 2021-09-01 2022-03-15 深圳盛达同泽科技有限公司 Light field-based refraction pattern generation method, device, equipment and storage medium
CN113936049A (en) * 2021-10-21 2022-01-14 北京的卢深视科技有限公司 Monocular structured light speckle image depth recovery method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN108052878B (en) Face recognition device and method
US20130335535A1 (en) Digital 3d camera using periodic illumination
US20180240265A1 (en) Systems and Methods for Depth-Assisted Perspective Distortion Correction
CN111046725B (en) Spatial positioning method based on face recognition and point cloud fusion of surveillance video
CN111091063A (en) Living body detection method, device and system
CN107517346B (en) Photographing method and device based on structured light and mobile device
WO2021136386A1 (en) Data processing method, terminal, and server
JP2009139995A (en) Unit and program for real time pixel matching in stereo image pair
US20160245641A1 (en) Projection transformations for depth estimation
JP2016537901A (en) Light field processing method
CN110213491B (en) Focusing method, device and storage medium
CN112102199A (en) Method, device and system for filling hole area of depth image
CN112749610A (en) Depth image, reference structured light image generation method and device and electronic equipment
CN110634138A (en) Bridge deformation monitoring method, device and equipment based on visual perception
GB2562037A (en) Three-dimensional scene reconstruction
CN111345025A (en) Camera device and focusing method
JP2001266128A (en) Method and device for obtaining depth information and recording medium recording depth information obtaining program
KR102082277B1 (en) Method for generating panoramic image and apparatus thereof
CN112073640B (en) Panoramic information acquisition pose acquisition method, device and system
CN108335329B (en) Position detection method and device applied to aircraft and aircraft
CN116245734A (en) Panoramic image generation method, device, equipment and storage medium
CN112750157B (en) Depth image generation method and device
KR20160101762A (en) The method of auto stitching and panoramic image genertation using color histogram
CN111080689B (en) Method and device for determining face depth map
CN113673285B (en) Depth reconstruction method, system, equipment and medium during capturing of depth camera

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code; Ref country code: HK; Ref legal event code: DE; Ref document number: 40043921; Country of ref document: HK

SE01 Entry into force of request for substantive examination