CN110853087B - Parallax estimation method, device, storage medium and terminal


Info

Publication number
CN110853087B
Authority
CN
China
Legal status: Active
Application number
CN202010035690.XA
Other languages
Chinese (zh)
Other versions
CN110853087A
Inventor
陈俊逸
Current Assignee
Changsha Xiaogu Technology Co ltd
Original Assignee
Changsha Xiaogu Technology Co ltd
Application filed by Changsha Xiaogu Technology Co ltd
Priority to CN202010035690.XA
Publication of CN110853087A
Application granted
Publication of CN110853087B

Classifications

    • G06T 7/55: Depth or shape recovery from multiple images (G Physics; G06 Computing; G06T Image data processing or generation, in general; G06T 7/00 Image analysis; G06T 7/50 Depth or shape recovery)
    • H04N 13/106: Processing image signals (H Electricity; H04 Electric communication technique; H04N Pictorial communication, e.g. television; H04N 13/00 Stereoscopic video systems; H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals)
    • H04N 13/239: Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance (H04N 13/20 Image signal generators; H04N 13/204 using stereoscopic image cameras)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a parallax estimation method, a parallax estimation device, a storage medium and a terminal, wherein the method comprises the following steps: obtaining a legal parallax signal from a parallax signal set to generate a parallax value mask map; inserting a parallax signal guidance module into a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model; inputting the expanded legal parallax value mask map into the inserted binocular parallax estimation model to generate a legal parallax value; generating a weighted function value corresponding to the legal parallax value based on a Gaussian function model and a preset weight calculation formula; generating a weighted parallax matching loss value based on the weighted function value; and inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation map guided by the parallax signal. Because a binocular parallax estimation scheme guided by sparse, high-reliability parallax signals is provided, the embodiment of the invention effectively solves the scene adaptability problem of binocular parallax estimation methods.

Description

Parallax estimation method, device, storage medium and terminal
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a disparity estimation method, apparatus, storage medium, and terminal.
Background
Accurate parallax estimation is the technical basis of scene depth estimation, which has important applications in robot navigation, industrial precision measurement, scene reconstruction, virtual reality, augmented reality, object recognition, and other fields. Parallax can be estimated by binocular vision, structured light, time-of-flight, and similar schemes. Compared with structured-light and time-of-flight schemes, binocular parallax estimation adapts to both indoor and outdoor scenes, is robust to illumination, and offers a large parallax estimation range, while the binocular setup is structurally simple and low in cost. Aiming at the binocular scheme, the invention discloses a binocular dense parallax estimation method guided by sparse parallax signals.
Early binocular parallax estimation methods acquired information such as local color and structure in the images captured by the cameras and matched consistent regions between the left and right cameras to estimate parallax. Because image color and local structure information are used directly, such methods are weakly robust to illumination and noise, and the parallax estimation quality is poor. Later, local color-coding features with some illumination invariance made parallax estimation more robust to illumination variations and some noise. Meanwhile, using image information over regions of different scales, such as global and semi-global, further improved parallax estimation quality.
In recent years, binocular parallax estimation methods based on convolutional neural networks have achieved breakthrough progress, improving parallax estimation accuracy and speed over traditional methods. However, when an existing convolutional-neural-network-based binocular parallax estimation method is migrated to different data sets and scenes, it suffers serious loss of parallax estimation accuracy, or even fails, which severely restricts its application in real scenes.
Disclosure of Invention
The embodiment of the invention provides a parallax estimation method, a parallax estimation device, a storage medium and a terminal. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
In a first aspect, an embodiment of the present invention provides a disparity estimation method, where the method includes:
acquiring a preset parallax signal set, and generating a synthetic parallax value mask map after acquiring a legal parallax signal from the preset parallax signal set;
the method comprises the steps of obtaining a parallax signal guide module, inserting the parallax signal guide module into a parallax matching loss calculation module of a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model;
expanding the legal parallax value mask image to generate an expanded legal parallax value mask image, and inputting the expanded legal parallax value mask image into the inserted binocular parallax estimation model to generate a legal parallax value;
inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
weighting the value of the parallax matching loss calculation module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation image guided by a parallax signal.
Optionally, the preset Gaussian function model is obtained by:
obtaining a vector corresponding to each parallax estimation value in a preset parallax estimation value set;
expanding the vectors corresponding to the parallax estimation values in the preset parallax estimation value set to generate an expanded parallax estimation value set;
and taking the expanded parallax estimation value set as the parameter values of the Gaussian function model to obtain the preset Gaussian function model.
optionally, the preset weight calculation formula is w 1-1-M + M w, where M is the expanded legal disparity value mask map, w is a function value corresponding to the legal disparity value, and w1 is a weighted function value.
Optionally, before the acquiring the preset disparity signal set, the method further includes:
acquiring a first gray image and a second gray image by using a binocular camera, wherein the image acquired by a left camera corresponds to the first gray image, and the image acquired by a right camera corresponds to the second gray image;
screening the first gray level image and the second gray level image to generate a third gray level image and a fourth gray level image;
and calibrating the binocular camera based on a preset calibration program, the third gray image and the fourth gray image to generate calibration parameters.
Optionally, after obtaining the calibration parameter, the method further includes:
and when the calibration parameter reaches a preset calibration parameter value, correcting the third gray-scale image and the fourth gray-scale image based on the calibration parameter to generate a fifth gray-scale image and a sixth gray-scale image.
Optionally, after the third grayscale image and the fourth grayscale image are corrected based on the calibration parameter to generate a fifth grayscale image and a sixth grayscale image, the method further includes:
inputting the fifth grayscale image and the sixth grayscale image into a preset semi-global block matching disparity estimation method model to generate a disparity estimation image corresponding to the fifth grayscale image and a disparity estimation image corresponding to the sixth grayscale image;
consistency check is carried out on the parallax estimation image corresponding to the fifth gray level image and the parallax estimation image corresponding to the sixth gray level image, and the confidence coefficient of each pixel point parallax value in the parallax estimation image corresponding to the fifth gray level image is obtained to generate a confidence coefficient image;
inputting the parallax estimation image corresponding to the fifth gray image into a preset sub-pixel estimation method model to generate a parallax value with sub-pixel precision;
and setting the parallax value with the confidence degree lower than a preset threshold value in the parallax values with the sub-pixel precision as zero to generate a parallax signal set based on the confidence map, wherein the parallax signal set is used as a preset parallax signal set.
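The consistency check and thresholding steps above can be sketched as follows. This is a simplified NumPy stand-in: the function name and threshold value are illustrative, and using the left-right disparity difference as the confidence measure is an assumption, not the patent's exact procedure.

```python
import numpy as np

def sparse_disparity_signal(disp_left, disp_right, conf_thresh=1.0):
    """Left-right consistency check: keep only disparities that agree between views.

    A pixel x in the left disparity map should satisfy
    disp_left[y, x] ~= disp_right[y, x - disp_left[y, x]]; the absolute
    difference serves as an inverse confidence. Low-confidence values are
    zeroed, leaving a sparse, high-reliability disparity signal set.
    """
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    # corresponding x coordinate in the right view, clipped into the image
    x_right = np.clip(xs - np.round(disp_left).astype(int), 0, w - 1)
    diff = np.abs(disp_left - disp_right[ys, x_right])
    return np.where(diff <= conf_thresh, disp_left, 0.0)

# Tiny synthetic example: the third pixel's disparity is inconsistent and gets zeroed.
left = np.array([[4.0, 4.0, 9.0, 4.0]])
right = np.array([[4.0, 4.0, 4.0, 4.0]])
print(sparse_disparity_signal(left, right))
```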
In a second aspect, an embodiment of the present invention provides a disparity estimation apparatus, including:
the parallax value mask image generation module is used for acquiring a preset parallax signal set, and generating a synthetic parallax value mask image after acquiring a legal parallax signal from the preset parallax signal set;
the model generation module is used for acquiring a parallax signal guide module, inserting the parallax signal guide module into a parallax matching loss calculation module of a preset binocular parallax estimation model, and generating an inserted binocular parallax estimation model;
a legal parallax value generation module, configured to expand the legal parallax value mask map to generate an expanded legal parallax value mask map, and input the expanded legal parallax value mask map into the inserted binocular parallax estimation model to generate a legal parallax value;
the function value generating module is used for inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
the loss value generating module is used for weighting the value of the parallax matching loss calculating module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and the parallax estimation map generation module is used for inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation map guided by a parallax signal.
Optionally, the apparatus further comprises:
the image acquisition module is used for acquiring a first gray image and a second gray image by using a binocular camera, wherein the image acquired by the left camera corresponds to the first gray image, and the image acquired by the right camera corresponds to the second gray image;
the image generation module is used for screening the first gray level image and the second gray level image and then generating a third gray level image and a fourth gray level image;
and the parameter generation module is used for calibrating the binocular camera based on a preset calibration program, the third gray image and the fourth gray image to generate calibration parameters.
Optionally, the apparatus further comprises:
and the image correction module is used for correcting the third gray image and the fourth gray image based on the calibration parameters to generate a fifth gray image and a sixth gray image when the calibration parameters reach preset calibration parameter values.
Optionally, the apparatus further comprises:
the parallax estimation image generation module is used for inputting the fifth gray image and the sixth gray image into a preset semi-global block matching parallax estimation method model to generate a parallax estimation image corresponding to the fifth gray image and a parallax estimation image corresponding to the sixth gray image;
the confidence map generation module is used for carrying out consistency check on the parallax estimation map corresponding to the fifth gray image and the parallax estimation map corresponding to the sixth gray image to obtain the confidence of the parallax value of each pixel point in the parallax estimation map corresponding to the fifth gray image so as to generate a confidence map;
the parallax value generating module is used for inputting the parallax estimation image corresponding to the fifth gray image into a preset sub-pixel estimation method model to generate a parallax value with sub-pixel precision;
and the set generating module is used for setting the parallax value with the confidence coefficient lower than a preset threshold value in the parallax values of the sub-pixel precision as zero to generate a parallax signal set based on the confidence coefficient map, wherein the parallax signal set is used as a preset parallax signal set.
In a third aspect, embodiments of the present invention provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present invention provides a terminal, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
in the embodiment of the present invention, the parallax estimation apparatus acquires a preset parallax signal set and, after obtaining a legal parallax signal from it, generates a synthesized parallax value mask map. It then acquires a parallax signal guidance module and inserts it into the parallax matching loss calculation module of a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model. The legal parallax value mask map is expanded and input into the inserted binocular parallax estimation model to generate a legal parallax value. The legal parallax value is input into a preset Gaussian function model to generate a corresponding function value, which is then input into a preset weight calculation formula to generate a weighted function value. The value of the parallax matching loss calculation module in the inserted binocular parallax estimation model is weighted based on the weighted function value to generate a weighted parallax matching loss value, which is finally input into the inserted binocular parallax estimation model to generate a parallax estimation map guided by the parallax signal. Therefore, because the invention provides a binocular parallax estimation scheme guided by sparse, high-reliability parallax signals, the embodiment effectively solves the scene adaptability problem of convolutional-neural-network-based binocular parallax estimation methods.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of a disparity estimation method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a method for acquiring a disparity signal set according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a disparity estimation apparatus according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of another disparity estimation apparatus according to an embodiment of the present invention;
fig. 5 is a flowchart of acquiring a disparity estimation map according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
To date, binocular parallax estimation methods based on convolutional neural networks have achieved breakthrough progress, with considerable improvements in parallax estimation accuracy and speed over traditional methods. However, when an existing convolutional-neural-network-based binocular parallax estimation method is migrated to different data sets and scenes, it suffers serious loss of parallax estimation accuracy, or even fails, which severely restricts its application in real scenes. To solve these problems in the related art, the technical scheme provided by the present application offers a binocular parallax estimation scheme guided by sparse, high-reliability parallax signals, which effectively solves the scene adaptability problem of convolutional-neural-network-based binocular parallax estimation methods; a detailed description follows by way of exemplary embodiments.
The following describes the parallax estimation method according to an embodiment of the present invention in detail with reference to fig. 1 to fig. 2 and fig. 5. The method may be implemented in dependence on a computer program, executable on a disparity estimation device based on the von neumann architecture. The computer program may be integrated into the application or may run as a separate tool-like application. The disparity estimation apparatus in the embodiment of the present invention may be a user terminal, including but not limited to: personal computers, tablet computers, handheld devices, in-vehicle devices, wearable devices, computing devices or other processing devices connected to a wireless modem, and the like. The user terminals may be called different names in different networks, for example: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote terminal, mobile device, user terminal, wireless communication device, user agent or user equipment, cellular telephone, cordless telephone, Personal Digital Assistant (PDA), terminal equipment in a 5G network or future evolution network, and the like.
Referring to fig. 1, a flow chart of a disparity estimation method according to an embodiment of the present invention is schematically shown. As shown in fig. 1, the method of an embodiment of the present invention may include the steps of:
s101, acquiring a preset parallax signal set, and generating a synthetic parallax value mask map after acquiring a legal parallax signal from the preset parallax signal set;
the preset parallax signal set is a sparse and high-reliability parallax signal set obtained according to the method steps in fig. 2, it should be noted that, except for the method steps in fig. 2, the preset parallax signal set may be used in any other method capable of obtaining a sparse and high-reliability parallax signal, and meanwhile, any sensing data capable of directly outputting a sparse and high-reliability parallax signal may also be used. The legal parallax value mask image is a legal parallax value mask image generated by acquiring a legal parallax signal from the preset parallax signal set and processing the legal parallax signal.
In a possible embodiment, a standard chessboard is printed first, where the board can be a gobang (five-in-a-row) board or a Chinese chess board; the specific board can be determined according to the actual situation and is not limited herein. After the standard chessboard is printed, it is fixed on a large flat plate, and a binocular camera, which has two cameras (a left camera and a right camera), is connected to a computer. After the binocular camera is successfully connected, the printed chessboard is placed at different positions and grayscale images of it are collected with the binocular camera; the images collected by the left camera are regarded as first grayscale images, and those collected by the right camera as second grayscale images. The terminal then screens the collected first and second grayscale images with an internal program, removing images in which the black-and-white grid points are blurred, the board is incomplete, or the board angle repeats an earlier image. The binocular camera is then calibrated with the calibration program and the screened calibration images; calibration parameters are generated when calibration finishes, the terminal verifies the calibration result, and calibration of the binocular camera is complete when the verification result meets the requirement.
After the calibration of the camera is finished, the terminal acquires calibrated parameters to correct the gray level image acquired by the binocular camera, and after the gray level image of the binocular camera is corrected, the same physical point of a real scene is projected to the same horizontal line of the gray level images acquired by the left camera and the right camera.
After the correction is finished, a semi-global block matching parallax estimation method is run on the corrected grayscale images collected by the binocular camera, computing one parallax estimation map referenced to the left camera's grayscale image and one referenced to the right camera's grayscale image. A consistency check is then performed between the two parallax estimation maps; when it finishes, a confidence map is obtained giving the confidence of the parallax value of each pixel in the left-referenced parallax estimation map. Next, sub-pixel-precision parallax values are obtained from the left-referenced parallax estimation map with a sub-pixel estimation method. Finally, based on the confidence map, the sub-pixel parallax values whose confidence is below a preset threshold are set to zero, yielding a high-reliability, sparse parallax signal to serve as the guidance signal. This high-reliability, sparse parallax signal is taken as the preset parallax signal set.
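The patent does not specify which sub-pixel estimation method model is used; a common choice, shown here purely as an assumption, is parabola fitting through the matching costs at the integer disparity and its two neighbours:

```python
import numpy as np

def subpixel_disparity(costs, d):
    """Refine integer disparity d by fitting a parabola through the matching
    costs at d-1, d, d+1 (a common sub-pixel scheme, assumed here; the patent
    does not name the exact method it uses)."""
    c_m, c_0, c_p = costs[d - 1], costs[d], costs[d + 1]
    denom = c_m - 2.0 * c_0 + c_p
    if denom == 0:           # flat cost curve: no refinement possible
        return float(d)
    return d + (c_m - c_p) / (2.0 * denom)

costs = np.array([9.0, 4.0, 1.0, 2.0, 7.0])  # minimum at integer disparity 2
print(subpixel_disparity(costs, 2))          # the refined value lies between 2 and 3
```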
S102, a parallax signal guide module is obtained, and the parallax signal guide module is inserted into a parallax matching loss calculation module of a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model;
the binocular disparity estimation model is a mathematical model for disparity estimation, and the model is created based on a Convolutional Neural Network (CNN), wherein the Convolutional Neural Network (CNN) is a type of feed-forward Neural network (fed-forward Neural network) containing convolution calculation and having a deep structure, and is one of the representative algorithms of deep learning (deep learning). The convolutional neural network has a representation learning (representation learning) capability, and can carry out shift-invariant classification (shift-invariant classification) on input information according to a hierarchical structure of the convolutional neural network. The parallax matching loss calculation module is a core module of the binocular parallax estimation model, and the module has the main function of calculating loss values after binocular parallax matching. The disparity signal guidance module is a module that guides the high-confidence, sparse disparity signal in step S101.
In a possible implementation, the user terminal first acquires, with an internal program, the parallax signal guidance module stored in the server for guiding the high-reliability sparse parallax signal, then acquires a pre-generated binocular parallax estimation model and its core parallax matching loss calculation module. The internal program splits the output of the parallax matching loss calculation module into two parts and inserts the parallax signal guidance module at the split point, generating the inserted binocular parallax estimation model. Finally, the spatial resolution of the inserted parallax signal guidance module is reduced to match the spatial resolution of the tensor output by the parallax matching loss calculation module, and it is expanded, through dimension expansion and element copying, into a form with the same dimensions as that output tensor.
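The resolution reduction, dimension expansion, and element copying described above can be sketched in NumPy (a framework such as PyTorch would use interpolation and broadcasting instead; the function name and shapes are illustrative assumptions):

```python
import numpy as np

def expand_mask_to_cost_volume(mask, out_h, out_w, num_disp):
    """Nearest-neighbour downsample an H x W guidance mask to the loss
    module's spatial resolution, then replicate it along a new disparity
    dimension so it matches a (num_disp, out_h, out_w) cost volume."""
    h, w = mask.shape
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    small = mask[ys[:, None], xs[None, :]]            # reduced spatial resolution
    return np.repeat(small[None, :, :], num_disp, 0)  # dimension expansion + element copying

mask = np.arange(16.0).reshape(4, 4)                  # toy 4x4 guidance mask
vol = expand_mask_to_cost_volume(mask, 2, 2, 3)       # match a 3x2x2 cost volume
print(vol.shape)
```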
S103, expanding the legal parallax value mask image to generate an expanded legal parallax value mask image, and inputting the expanded legal parallax value mask image into the inserted binocular parallax estimation model to generate a legal parallax value;
step S101 may be referred to for generating the legal disparity value mask map, and details are not described here. The binocular disparity estimation model may specifically refer to step S102, and details thereof are not repeated here.
Usually, the spatial resolution and dimensions of the parallax value mask map generated in step S101 differ from those of the tensor output by the parallax matching loss calculation module, so the mask map must be processed until its spatial resolution and dimensions match those of that output tensor. Once they match, the expanded legal parallax value mask map can be input into the inserted binocular parallax estimation model to generate a legal parallax value.
In the embodiment of the application, the user terminal first obtains the pre-generated parallax value mask map, analyzes it with an internal preset program to obtain its spatial resolution and dimensions, then obtains the resolution and dimensions of the tensor output by the parallax matching loss calculation module, and finally processes the mask map so that its spatial resolution and dimensions equal those of the output tensor. When they are equal, the user terminal inputs the expanded parallax value mask map into the binocular parallax estimation model with the inserted parallax signal guidance module to generate a legal parallax value.
S104, inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
The Gaussian function is widely used in statistics; in image processing it is commonly used for Gaussian blur. The function value corresponding to the legal disparity value is generated by inputting the legal disparity value into a preset Gaussian function model, where the parameter of the Gaussian function is obtained by processing the vectors corresponding to the preset disparity estimation values, and the resulting parameter is used as the center of the Gaussian function.
Generally, the user terminal first obtains a preset disparity estimation value set and represents each disparity value in the set as a vector, as shown in Table 1:
TABLE 1

Preset disparity value | 1 | 2 | … | maxD
Corresponding vector   | 0 | 1 | … | maxD-1
In Table 1, the vector corresponding to the preset disparity value 1 is 0, the vector corresponding to the preset value 2 is 1, and so on, up to the preset value maxD, whose vector is maxD-1. The spatial resolution and dimensionality of the vectorized disparity values are then expanded so that they match the resolution and dimensionality of the output of the disparity matching loss calculation module, and the expanded preset vector is used as the center of the Gaussian function.
In the embodiment of the present invention, the generated legal disparity value is first input into the high-dimensional Gaussian function model to produce a function value, recorded as the weight W. The weight W is then input into the preset formula W1 = 1 - M + M × W to obtain the new weighting, that is, the weighted function value, where M is the expanded legal disparity value mask map and W1 is the weighted function value.
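A minimal sketch of this step, assuming a one-dimensional Gaussian over the candidate disparity vector of Table 1; the width sigma and peak height k are illustrative parameters not specified in the text:

```python
import numpy as np

def weighted_function_value(d_hint, M, max_d, sigma=1.0, k=10.0):
    """W: Gaussian response of every candidate disparity 0..maxD-1,
    centred on the legal (hinted) disparity value d_hint.
    Returns W1 = (1 - M) + M * W, so a pixel without a disparity
    hint (M = 0) keeps a uniform weight of 1 for every candidate."""
    d = np.arange(max_d, dtype=np.float32)  # candidate vector, as in Table 1
    W = k * np.exp(-((d - d_hint) ** 2) / (2.0 * sigma ** 2))
    return (1.0 - M) + M * W

w_hint = weighted_function_value(d_hint=5.0, M=1.0, max_d=16)  # guided pixel
w_none = weighted_function_value(d_hint=5.0, M=0.0, max_d=16)  # unguided pixel
```

Note how the mask M switches the guidance on and off per pixel: the guided pixel's weight peaks at the hinted disparity, while the unguided pixel is left neutral.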
S105, weighting the value of the parallax matching loss calculation module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
In a possible implementation, the user terminal first obtains the weighted function value from step S104, then obtains the value of the disparity matching loss calculation module (a core module) in the binocular disparity estimation model with the inserted disparity signal guidance module, and finally applies the weighted function value to that value to generate a weighted disparity matching loss value.
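One plausible reading of this weighting, sketched in NumPy: the weighted function value W1 modulates the matching-loss volume element-wise. Multiplication is an assumption here (the text only says the weight is applied); it has the convenient property that unguided pixels, where W1 equals 1, keep their original loss.

```python
import numpy as np

def weight_matching_loss(loss, W1):
    """Element-wise modulation of a (D, H, W) matching-loss volume by
    the weighted function value W1 of the same shape (an assumed
    multiplicative combination)."""
    return loss * W1

D, H, W = 4, 2, 2
loss = np.ones((D, H, W), dtype=np.float32)
W1 = np.ones((D, H, W), dtype=np.float32)
W1[:, 0, 0] = [3.0, 1.0, 0.5, 0.1]   # one guided pixel, favouring one candidate
weighted = weight_matching_loss(loss, W1)
```

Only the guided pixel's column of the volume changes; every unguided entry is returned untouched.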
And S106, inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation image guided by the parallax signal.
In the embodiment of the invention, the weighted disparity matching loss value obtained in step S105 is input into the binocular disparity estimation model to generate the disparity estimation image guided by the disparity signal.
For example, as shown in fig. 5, a chessboard is placed at different positions and the grayscale images of the left and right cameras are collected. The collected grayscale image samples and a calibration program are then used to calibrate the left and right cameras and compute the calibration parameters. The user terminal judges the quality of the calibration parameters; when the quality reaches a set threshold, consistency correction is performed on the grayscale images of the left and right cameras so that the same physical point lies on the same horizontal line in both images. A sparse, highly reliable disparity signal set is then obtained in a preset manner, and binocular disparity estimation is finally guided by this set.
In binocular disparity estimation, the legal disparity signal is first processed, the weight guided by the legal disparity signal is computed, the disparity matching loss in the disparity estimation model is weighted accordingly, the disparity is estimated from the weighted disparity matching loss value, and a disparity estimation image guided by the sparse, highly reliable disparity signals is finally obtained.
In the embodiment of the present invention, the disparity estimation apparatus obtains a preset disparity signal set and generates a synthesized disparity value mask map after obtaining the legal disparity signals in the set. It then obtains a disparity signal guidance module and inserts it into the disparity matching loss calculation module of a preset binocular disparity estimation model to generate an inserted binocular disparity estimation model. The legal disparity value mask map is expanded and input into the inserted model to generate a legal disparity value; that value is input into a preset Gaussian function model to generate a corresponding function value, which is in turn input into a preset weight calculation formula to generate a weighted function value. The value of the disparity matching loss calculation module in the inserted model is then weighted based on the weighted function value to generate a weighted disparity matching loss value, which is finally input into the inserted model to generate a disparity estimation image guided by the disparity signals. By adopting the embodiment of the invention, the sparse, highly reliable disparity-signal-guided binocular disparity estimation scheme effectively solves the problem of poor scene adaptability in convolutional-neural-network-based binocular disparity estimation.
Referring to fig. 2, before step S101, the user terminal needs to acquire an image and acquire a disparity signal set according to the image. Optionally, the acquiring process of the disparity signal set includes, but is not limited to, the following steps:
s201, a binocular camera is used for collecting a first gray image and a second gray image, wherein the image obtained by a left camera corresponds to the first gray image, and the image obtained by a right camera corresponds to the second gray image;
A binocular camera can be understood as a device with two cameras, one on the left and one on the right. The first grayscale image and the second grayscale image are both grayscale images of a preset chessboard captured by the cameras.
In a possible implementation, a standard chessboard is printed first. The chessboard may be, for example, a gobang board or a Chinese chess board; the specific chessboard can be determined according to the actual situation and is not limited here. After the standard chessboard is printed, it is fixed on a large flat plate, and the binocular camera, which has two cameras (a left camera and a right camera), is connected to a computer. Once connected, the pre-printed chessboard is placed at different positions and the binocular camera collects grayscale images of it: the images collected by the left camera are the first grayscale images, and those collected by the right camera are the second grayscale images.
S202, screening the first gray level image and the second gray level image to generate a third gray level image and a fourth gray level image;
The third grayscale image corresponds to the grayscale image generated by screening the first grayscale image, and the fourth grayscale image corresponds to the grayscale image generated by screening the second grayscale image. Screening removes the blurred, indistinct, or defective parts of the first and second grayscale images.
In a feasible implementation, the terminal screens the acquired first and second grayscale images with an internal program, eliminating images with blurred black-and-white grid points, images in which the chessboard is incomplete, and images in which the chessboard angle duplicates that of another image.
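The screening criterion is not specified; one common blur test is the variance of a Laplacian response, sketched here in plain NumPy (the threshold value is hypothetical):

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian response; low values
    indicate a blurred image that should be screened out."""
    g = gray.astype(np.float32)
    lap = (-4.0 * g[1:-1, 1:-1] + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())

def is_sharp(gray, threshold=100.0):
    """Keep only images whose edge response exceeds a hypothetical threshold."""
    return laplacian_variance(gray) >= threshold

# A checkerboard-like pattern has strong edges; a constant image has none.
sharp = (np.indices((32, 32)).sum(axis=0) % 2) * 255
flat = np.full((32, 32), 128)
```

In practice the threshold would be tuned on a few known-good calibration shots.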
S203, calibrating the binocular camera based on a preset calibration program, the third gray image and the fourth gray image to generate calibration parameters;
In the embodiment of the application, the user terminal calibrates the binocular camera using the calibration program and the screened calibration images, and generates calibration parameters when calibration is finished. The terminal then verifies the calibration result, and calibration of the binocular camera is complete when the verification result meets the requirement.
S204, when the calibration parameter reaches a preset calibration parameter value, correcting the third gray image and the fourth gray image based on the calibration parameter to generate a fifth gray image and a sixth gray image;
The preset calibration parameter value is a measurement for judging whether the calibration parameters generated in step S203 meet the expected standard.
In this embodiment of the application, when the calibration parameter value generated in step S203 is greater than or equal to the preset calibration parameter value, the user terminal corrects the third grayscale image and the fourth grayscale image by using the calibration parameter generated in step S203 to generate a fifth grayscale image and a sixth grayscale image;
s205, inputting the fifth grayscale image and the sixth grayscale image into a preset semi-global block matching disparity estimation method model to generate a disparity estimation image corresponding to the fifth grayscale image and a disparity estimation image corresponding to the sixth grayscale image;
The semi-global block matching disparity estimation model offers good disparity quality at high speed and is widely used for binocular disparity estimation.
In the embodiment of the present invention, the user terminal first generates a fifth grayscale image and a sixth grayscale image based on step S204, then obtains a semi-global block matching disparity estimation method model stored in the server, and finally inputs the fifth grayscale image and the sixth grayscale image into the semi-global block matching disparity estimation method model to generate a disparity estimation map corresponding to the fifth grayscale image and a disparity estimation map corresponding to the sixth grayscale image.
S206, performing consistency check on the parallax estimation image corresponding to the fifth gray level image and the parallax estimation image corresponding to the sixth gray level image to obtain a confidence coefficient of each pixel point parallax value in the parallax estimation image corresponding to the fifth gray level image to generate a confidence coefficient image;
In the embodiment of the application, consistency verification is first performed on the two computed disparity estimation images. After verification, a confidence map is obtained for the disparity value of each pixel in the disparity estimation image referenced to the grayscale image collected by the left camera. A sub-pixel estimation method is then used to obtain disparity values with sub-pixel precision in that disparity estimation image.
S207, inputting the parallax estimation image corresponding to the fifth gray image into a preset sub-pixel estimation method model to generate a parallax value with sub-pixel precision;
specifically, refer to step S206, which is not described herein again.
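The patent does not spell out the sub-pixel estimation model; a common choice is to fit a parabola through the matching costs around the integer minimum, sketched below:

```python
import numpy as np

def subpixel_refine(cost, d):
    """Refine an integer disparity d to sub-pixel precision by fitting a
    parabola through the costs at d-1, d, d+1 (one common sub-pixel
    estimation method; the patent's exact model may differ)."""
    c0, c1, c2 = cost[d - 1], cost[d], cost[d + 1]
    denom = c0 - 2.0 * c1 + c2
    if denom == 0.0:          # flat neighbourhood: nothing to refine
        return float(d)
    return d + 0.5 * (c0 - c2) / denom

# For an exactly quadratic cost curve the fit recovers the true minimum.
cost = (np.arange(16, dtype=np.float64) - 5.3) ** 2
d_int = int(cost.argmin())
d_sub = subpixel_refine(cost, d_int)
```

For the quadratic test curve above the parabola fit is exact, recovering 5.3 from integer costs.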
And S208, setting the parallax value with the confidence coefficient lower than a preset threshold value in the parallax values with the sub-pixel precision as zero based on the confidence coefficient map, and generating a parallax signal set, wherein the parallax signal set is used as a preset parallax signal set.
In the embodiment of the application, based on the confidence map, the sub-pixel-precision disparity values whose confidence is lower than the preset threshold are set to zero in the disparity estimation image referenced to the grayscale image collected by the left camera. This yields highly reliable, sparse disparity signals that serve as the guiding signals, and the obtained signals are finally used as the preset disparity signal set.
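Steps S206 and S208 can be sketched together in NumPy, assuming the usual left-right check |D_L(x) - D_R(x - D_L(x))| <= tol and a binary confidence; the tolerance and threshold values are illustrative:

```python
import numpy as np

def lr_consistency_mask(disp_l, disp_r, tol=1.0):
    """Left-right check: a left-image pixel x with disparity d should map
    to x - d in the right image, whose disparity should agree within tol.
    Returns a binary confidence map (1 = consistent)."""
    H, W = disp_l.shape
    xs = np.arange(W)[None, :].repeat(H, axis=0)
    xr = np.clip((xs - disp_l).round().astype(int), 0, W - 1)
    ys = np.arange(H)[:, None].repeat(W, axis=1)
    return (np.abs(disp_l - disp_r[ys, xr]) <= tol).astype(np.float32)

def sparse_signals(disp_subpix, conf, threshold=0.5):
    """Zero out disparities whose confidence is below the threshold,
    leaving a sparse, high-reliability disparity signal set."""
    return np.where(conf >= threshold, disp_subpix, 0.0)

disp_l = np.full((4, 8), 2.0)
disp_r = np.full((4, 8), 2.0)
disp_r[:, 0:3] = 7.0              # an inconsistent (e.g. occluded) region
conf = lr_consistency_mask(disp_l, disp_r)
signals = sparse_signals(disp_l, conf)
```

Pixels failing the check get confidence 0 and are zeroed, exactly the sparse guiding signal described above.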
It should be noted that, besides the method illustrated in fig. 2, any method capable of obtaining the disparity signal set may be used. Likewise, any sensor whose data can directly yield a disparity signal set is usable, such as a LiDAR sensor. After the disparity signal set is obtained by the above method, steps S101 to S106 in fig. 1 may be performed.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Please refer to fig. 3, which shows a schematic structural diagram of a disparity estimation apparatus according to an exemplary embodiment of the present invention. The disparity estimation apparatus can be realized by software, hardware, or a combination of the two to form all or part of a terminal. The apparatus 1 comprises a disparity value mask map generating module 10, a model generating module 20, a legal disparity value generating module 30, a function value generating module 40, a loss value generating module 50, and a disparity estimation map generating module 60.
A disparity value mask map generating module 10, configured to obtain a preset disparity signal set, and generate a synthetic disparity value mask map after obtaining a legal disparity signal from the preset disparity signal set;
the model generation module 20 is configured to obtain a disparity signal guidance module, insert the disparity signal guidance module into a disparity matching loss calculation module of a preset binocular disparity estimation model, and generate an inserted binocular disparity estimation model;
a legal disparity value generating module 30, configured to expand the legal disparity value mask map to generate an expanded legal disparity value mask map, and input the expanded legal disparity value mask map into the inserted binocular disparity estimation model to generate a legal disparity value;
a function value generating module 40, configured to input the legal disparity value into a preset gaussian function model to generate a function value corresponding to the legal disparity value, and input the function value corresponding to the legal disparity value into a preset weight calculation formula to generate a weighted function value;
a loss value generating module 50 for weighting the value of the parallax matching loss calculating module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and a disparity estimation map generation module 60, configured to input the weighted disparity matching loss value into the inserted binocular disparity estimation model to generate a disparity estimation map guided by disparity signals.
Optionally, as shown in fig. 4, the apparatus 1 further includes:
the image acquisition module 70 is configured to acquire a first grayscale image and a second grayscale image by using a binocular camera, where an image acquired by a left camera corresponds to the first grayscale image, and an image acquired by a right camera corresponds to the second grayscale image;
an image generating module 80, configured to filter the first grayscale image and the second grayscale image and generate a third grayscale image and a fourth grayscale image;
a parameter generating module 90, configured to calibrate the binocular camera based on a preset calibration program and the third and fourth grayscale images to generate calibration parameters.
Optionally, as shown in fig. 4, the apparatus 1 further includes:
and the image correction module 100 is configured to correct the third grayscale image and the fourth grayscale image based on the calibration parameter to generate a fifth grayscale image and a sixth grayscale image when the calibration parameter reaches a preset calibration parameter value.
Optionally, as shown in fig. 4, the apparatus 1 further includes:
a disparity estimation map generation module 110, configured to input the fifth grayscale image and the sixth grayscale image into a preset semi-global block matching disparity estimation method model to generate a disparity estimation map corresponding to the fifth grayscale image and a disparity estimation map corresponding to the sixth grayscale image;
a confidence map generation module 120, configured to perform consistency check on the disparity estimation map corresponding to the fifth grayscale image and the disparity estimation map corresponding to the sixth grayscale image, and obtain a confidence of the disparity value of each pixel in the disparity estimation map corresponding to the fifth grayscale image to generate a confidence map;
a disparity value generating module 130, configured to input the disparity estimation map corresponding to the fifth grayscale image into a preset sub-pixel estimation method model to generate a disparity value with sub-pixel precision;
a set generating module 140, configured to set, based on the confidence map, a disparity value with a confidence level lower than a preset threshold in the disparity values of the sub-pixel precision to zero to generate a disparity signal set, where the disparity signal set is a preset disparity signal set.
It should be noted that when the disparity estimation apparatus provided in the foregoing embodiment performs the disparity estimation method, only the division into the above functional modules is illustrated; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the disparity estimation apparatus and the disparity estimation method provided in the above embodiments belong to the same concept; for details of the implementation process, refer to the method embodiments, which are not repeated here.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The present invention also provides a computer readable medium having stored thereon program instructions that, when executed by a processor, implement the disparity estimation method provided by the various method embodiments described above.
The present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the disparity estimation method as described in the various method embodiments above.
Fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention. As shown in fig. 6, the terminal 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
Wherein a communication bus 1002 is used to enable connective communication between these components.
The user interface 1003 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Processor 1001 may include one or more processing cores. Using various interfaces and lines, the processor 1001 connects the components of the electronic device 1000, and it performs the functions of the electronic device 1000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1005 and invoking data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 1001 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU renders and draws the content to be displayed on the display screen; and the modem handles wireless communication. It is understood that the modem may not be integrated into the processor 1001 but may instead be implemented by a separate chip.
The Memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1005 may include a stored-program area and a stored-data area: the stored-program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the various method embodiments described above, and the like; the stored-data area may store the data referred to in the above method embodiments. The memory 1005 may optionally be at least one storage device located remotely from the processor 1001. As shown in fig. 6, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a disparity estimation application program.
In the terminal 1000 shown in fig. 6, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to call the disparity estimation application stored in the memory 1005, and specifically perform the following operations:
acquiring a target pedestrian image;
inputting the target pedestrian image into a pre-trained pedestrian attribute recognition model, wherein the pedestrian attribute recognition model is generated based on a first data sample and a second data sample, and the second data sample is generated by inputting the first data sample into a pre-trained style transition model;
and outputting each attribute value corresponding to the target pedestrian image.
In one embodiment, the processor 1001 further performs the following operations before performing the acquiring of the target pedestrian image:
acquiring a preset parallax signal set, and generating a synthetic parallax value mask map after acquiring a legal parallax signal from the preset parallax signal set;
the method comprises the steps of obtaining a parallax signal guide module, inserting the parallax signal guide module into a parallax matching loss calculation module of a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model;
expanding the legal parallax value mask image to generate an expanded legal parallax value mask image, and inputting the expanded legal parallax value mask image into the inserted binocular parallax estimation model to generate a legal parallax value;
inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
weighting the value of the parallax matching loss calculation module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation image guided by a parallax signal.
In one embodiment, before performing the acquiring the preset disparity signal set, the processor 1001 further performs the following operations:
acquiring a first gray image and a second gray image by using a binocular camera, wherein the image acquired by a left camera corresponds to the first gray image, and the image acquired by a right camera corresponds to the second gray image;
screening the first gray level image and the second gray level image to generate a third gray level image and a fourth gray level image;
and calibrating the binocular camera based on a preset calibration program, the third gray image and the fourth gray image to generate calibration parameters.
In one embodiment, after the obtaining calibration parameters, the processor 1001 further performs the following operations:
and when the calibration parameter reaches a preset calibration parameter value, correcting the third gray-scale image and the fourth gray-scale image based on the calibration parameter to generate a fifth gray-scale image and a sixth gray-scale image.
In one embodiment, after performing the correction of the third grayscale image and the fourth grayscale image based on the calibration parameters to generate a fifth grayscale image and a sixth grayscale image, the processor 1001 further performs the following operations:
inputting the fifth grayscale image and the sixth grayscale image into a preset semi-global block matching disparity estimation method model to generate a disparity estimation image corresponding to the fifth grayscale image and a disparity estimation image corresponding to the sixth grayscale image;
consistency check is carried out on the parallax estimation image corresponding to the fifth gray level image and the parallax estimation image corresponding to the sixth gray level image, and the confidence coefficient of each pixel point parallax value in the parallax estimation image corresponding to the fifth gray level image is obtained to generate a confidence coefficient image;
inputting the parallax estimation image corresponding to the fifth gray image into a preset sub-pixel estimation method model to generate a parallax value with sub-pixel precision;
and setting the parallax value with the confidence degree lower than a preset threshold value in the parallax values with the sub-pixel precision as zero to generate a parallax signal set based on the confidence map, wherein the parallax signal set is used as a preset parallax signal set.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, it should be understood that the disclosed methods, articles of manufacture (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
It should be understood that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. The present invention is not limited to the procedures and structures that have been described above and shown in the drawings, and various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A disparity estimation method, characterized in that the method comprises:
acquiring a preset parallax signal set, and generating a synthetic parallax value mask map after acquiring a legal parallax signal from the preset parallax signal set;
the method comprises the steps of obtaining a parallax signal guide module, inserting the parallax signal guide module into a parallax matching loss calculation module of a preset binocular parallax estimation model to generate an inserted binocular parallax estimation model;
expanding the legal parallax value mask image to generate an expanded legal parallax value mask image, and inputting the expanded legal parallax value mask image into the inserted binocular parallax estimation model to generate a legal parallax value;
inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
weighting the value of the parallax matching loss calculation module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation image guided by a parallax signal.
2. The method of claim 1, wherein the preset Gaussian function model is obtained by:
obtaining a vector corresponding to each parallax estimation value in a preset parallax estimation value set;
expanding the vectors corresponding to the parallax estimation values in the preset parallax estimation value set to generate an expanded parallax estimation value set;
and taking the expanded parallax estimation value set as a parameter value of the Gaussian function model to obtain a preset Gaussian function model.
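One plausible reading of claim 2, sketched below: each disparity estimation value in the set is expanded to a per-pixel plane, and the expanded set serves as the mean parameters of a per-candidate Gaussian model. The function and parameter names, shapes, and the fixed `sigma` are assumptions for illustration only.

```python
import numpy as np

def gaussian_model_from_candidates(candidates, H, W, sigma=1.0):
    # Expand each disparity estimation value to an (H, W) plane; the expanded
    # set acts as the mean parameters of the Gaussian function model.
    means = np.stack([np.full((H, W), d, dtype=float) for d in candidates], axis=-1)

    def model(d_legal):
        # Gaussian function value of a legal disparity value under each candidate mean
        return np.exp(-(means - d_legal[..., None]) ** 2 / (2.0 * sigma ** 2))

    return model
```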
3. The method of claim 1, wherein the preset weight calculation formula is w1 = 1 - M + M × w, where M is the expanded legal parallax value mask map, w is the function value corresponding to the legal parallax value, and w1 is the weighted function value.
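Under the reading w1 = 1 - M + M × w, the mask blends the Gaussian function value with a neutral weight of 1. A minimal numeric check of that behavior (the array values are arbitrary examples):

```python
import numpy as np

def weighted_function_value(M, w):
    # w1 = 1 - M + M * w: where the mask M is 0 the weight stays neutral (1);
    # where M is 1 the Gaussian function value w passes through unchanged.
    return 1.0 - M + M * w

M = np.array([0.0, 1.0, 1.0])
w = np.array([0.3, 0.3, 0.9])
print(weighted_function_value(M, w))   # [1.  0.3 0.9]
```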
4. The method according to claim 1, wherein before the acquiring of the preset parallax signal set, the method further comprises:
acquiring a first gray image and a second gray image by using a binocular camera, wherein the image acquired by a left camera corresponds to the first gray image, and the image acquired by a right camera corresponds to the second gray image;
screening the first gray level image and the second gray level image to generate a third gray level image and a fourth gray level image;
and calibrating the binocular camera based on a preset calibration program, the third gray image and the fourth gray image to generate calibration parameters.
5. The method of claim 4, wherein after the generating of the calibration parameters, the method further comprises:
and when the calibration parameter reaches a preset calibration parameter value, correcting the third gray-scale image and the fourth gray-scale image based on the calibration parameter to generate a fifth gray-scale image and a sixth gray-scale image.
6. The method of claim 5, wherein after the correcting the third grayscale image and the fourth grayscale image based on the calibration parameters to generate a fifth grayscale image and a sixth grayscale image, the method further comprises:
inputting the fifth grayscale image and the sixth grayscale image into a preset semi-global block matching (SGBM) disparity estimation model to generate a disparity estimation image corresponding to the fifth grayscale image and a disparity estimation image corresponding to the sixth grayscale image;
performing a consistency check on the parallax estimation image corresponding to the fifth grayscale image and the parallax estimation image corresponding to the sixth grayscale image, and obtaining the confidence of the parallax value at each pixel in the parallax estimation image corresponding to the fifth grayscale image to generate a confidence map;
inputting the parallax estimation image corresponding to the fifth gray image into a preset sub-pixel estimation method model to generate a parallax value with sub-pixel precision;
and setting to zero, among the parallax values with sub-pixel precision, those whose confidence is lower than a preset threshold, so as to generate a parallax signal set based on the confidence map, the parallax signal set being used as the preset parallax signal set.
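The chain in claim 6 (left-right consistency check, sub-pixel refinement, confidence thresholding) can be sketched as follows. The tolerance, threshold, and confidence formula are assumptions, and the parabolic fit is one common sub-pixel method, not necessarily the patent's "preset sub-pixel estimation method model".

```python
import numpy as np

def lr_consistency_confidence(disp_left, disp_right, tol=1.0):
    """Confidence map from a left-right consistency check: a left pixel at
    column x with disparity d should map to a right pixel at x - d carrying
    (roughly) the same disparity."""
    H, W = disp_left.shape
    conf = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            xr = x - int(round(disp_left[y, x]))
            if 0 <= xr < W:
                err = abs(disp_left[y, x] - disp_right[y, xr])
                conf[y, x] = max(0.0, 1.0 - err / (tol + err))
    return conf

def subpixel_parabola(cost, d):
    """Parabolic sub-pixel refinement around integer disparity d."""
    c0, c1, c2 = cost[d - 1], cost[d], cost[d + 1]
    denom = c0 - 2.0 * c1 + c2
    return d if denom == 0 else d + 0.5 * (c0 - c2) / denom

def threshold_signals(disp_subpix, conf, thresh=0.5):
    # Parallax values whose confidence is below the preset threshold become zero
    return np.where(conf >= thresh, disp_subpix, 0.0)
```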
7. The method according to claim 1, wherein the network structure of the binocular disparity estimation model is a PSMNet model or an iResNet model.
8. A disparity estimation apparatus, characterized in that the apparatus comprises:
the parallax value mask image generation module is used for acquiring a preset parallax signal set, and generating a synthetic parallax value mask image after acquiring a legal parallax signal from the preset parallax signal set;
the model generation module is used for acquiring a parallax signal guide module, inserting the parallax signal guide module into a parallax matching loss calculation module of a preset binocular parallax estimation model, and generating an inserted binocular parallax estimation model;
a legal parallax value generation module, configured to expand the legal parallax value mask map to generate an expanded legal parallax value mask map, and input the expanded legal parallax value mask map into the inserted binocular parallax estimation model to generate a legal parallax value;
the function value generating module is used for inputting the legal parallax value into a preset Gaussian function model to generate a function value corresponding to the legal parallax value, and inputting the function value corresponding to the legal parallax value into a preset weight calculation formula to generate a weighted function value;
the loss value generating module is used for weighting the value of the parallax matching loss calculating module in the inserted binocular parallax estimation model based on the weighted function value to generate a weighted parallax matching loss value;
and the parallax estimation map generation module is used for inputting the weighted parallax matching loss value into the inserted binocular parallax estimation model to generate a parallax estimation map guided by a parallax signal.
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to carry out the method steps according to any one of claims 1 to 7.
10. A terminal, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 7.
CN202010035690.XA 2020-01-14 2020-01-14 Parallax estimation method, device, storage medium and terminal Active CN110853087B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010035690.XA CN110853087B (en) 2020-01-14 2020-01-14 Parallax estimation method, device, storage medium and terminal

Publications (2)

Publication Number Publication Date
CN110853087A CN110853087A (en) 2020-02-28
CN110853087B true CN110853087B (en) 2020-04-28

Family

ID=69610690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010035690.XA Active CN110853087B (en) 2020-01-14 2020-01-14 Parallax estimation method, device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN110853087B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721109B (en) * 2023-08-11 2023-11-03 合肥图迅电子科技有限公司 Half global matching method for binocular vision images

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10453220B1 (en) * 2017-12-29 2019-10-22 Perceive Corporation Machine-trained network for misalignment-insensitive depth perception
CN110533663A (en) * 2018-05-25 2019-12-03 杭州海康威视数字技术股份有限公司 A kind of image parallactic determines method, apparatus, equipment and system
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks
WO2019182974A3 (en) * 2018-03-21 2019-12-05 Nvidia Corporation Stereo depth estimation using deep neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110310320B (en) * 2019-07-09 2023-07-07 南京美基森信息技术有限公司 Binocular vision matching cost aggregation optimization method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"AMNet: Deep Atrous Multiscale Stereo Disparity Estimation Networks";Du X;《arXiv》;20190419;第1-25页 *
"Learning to Adapt for Stereo";Alessio Tonioni;《arXiv》;20190405;第1-15页 *
"利用自监督卷积网络估计单图像深度信息";孙蕴瀚;《计算机辅助设计与图形学学报》;20200113;第1-10页 *
"基于高斯拉普拉斯算子的加权引导图滤波立体匹配算法";周博;《激光与光电子学进展》;20190531;第1-7页 *

Also Published As

Publication number Publication date
CN110853087A (en) 2020-02-28

Similar Documents

Publication Publication Date Title
CN110135455B (en) Image matching method, device and computer readable storage medium
CN108074267B (en) Intersection point detection device and method, camera correction system and method, and recording medium
CN111242088A (en) Target detection method and device, electronic equipment and storage medium
CN110992366B (en) Image semantic segmentation method, device and storage medium
CN109640066A (en) The generation method and device of high-precision dense depth image
CN106952247B (en) Double-camera terminal and image processing method and system thereof
CN112200771A (en) Height measuring method, device, equipment and medium
CN110335330A (en) Image simulation generation method and its system, deep learning algorithm training method and electronic equipment
CN112802081B (en) Depth detection method and device, electronic equipment and storage medium
CN109661815A (en) There are the robust disparity estimations in the case where the significant Strength Changes of camera array
CN110533663B (en) Image parallax determining method, device, equipment and system
CN110458954B (en) Contour line generation method, device and equipment
CN110853087B (en) Parallax estimation method, device, storage medium and terminal
CN113516697B (en) Image registration method, device, electronic equipment and computer readable storage medium
CN112197708B (en) Measuring method and device, electronic device and storage medium
CN110378948B (en) 3D model reconstruction method and device and electronic equipment
CN116823639A (en) Image distortion correction method, device, equipment and storage medium
CN113689378B (en) Determination method and device for accurate positioning of test strip, storage medium and terminal
CN113298098B (en) Fundamental matrix estimation method and related product
US20230316460A1 (en) Binocular image quick processing method and apparatus and corresponding storage medium
CN112884817B (en) Dense optical flow calculation method, dense optical flow calculation device, electronic device, and storage medium
CN113870210A (en) Image quality evaluation method, device, equipment and storage medium
CN113744361A (en) Three-dimensional high-precision map construction method and device based on trinocular vision
CN113048899A (en) Thickness measuring method and system based on line structured light
CN111626919A (en) Image synthesis method and device, electronic equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant