CN113592021B - Stereo matching method based on deformable and depth separable convolution - Google Patents

Stereo matching method based on deformable and depth separable convolution

Info

Publication number
CN113592021B
CN113592021B (application CN202110916262.2A)
Authority
CN
China
Prior art keywords
convolution
deformable
image
depth
stereo matching
Prior art date
Legal status
Active
Application number
CN202110916262.2A
Other languages
Chinese (zh)
Other versions
CN113592021A (en)
Inventor
高会敏
徐志京
Current Assignee
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN202110916262.2A
Publication of CN113592021A
Application granted
Publication of CN113592021B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a stereo matching method based on deformable and depth separable convolution, which comprises the following steps: inputting a left image and a right image into a deformable feature extraction network model to extract effective features, wherein the left image and the right image are the two images obtained by a binocular vision camera; concatenating the effective features and fusing them to obtain a cost volume; inputting the cost volume into a depth-separable 3D CNN network model, learning features of different scales, positions and forms, and aggregating effective information to obtain the image learned by the 3D CNN network; restoring the image learned by the 3D CNN network to the original image size using up-sampling; and performing disparity regression prediction on the restored image with a softmax function, and outputting a disparity map. By applying the embodiment of the invention, the network adapts to object feature deformation, enlarges the effective receptive field and reduces information loss; the depth-separable convolution integrated into the aggregation network reduces the huge parameter count brought by the 3D CNN and cuts the computation amount.

Description

Stereo matching method based on deformable and depth separable convolution
Technical Field
The invention relates to the technical field of machine vision and binocular vision, in particular to a stereo matching method based on deformable and depth separable convolution.
Background
Computer vision is a discipline that studies how to use computers to simulate the human visual system, and binocular stereo vision is an important branch of the computer vision field. The system perceives the real world by imitating human binocular perception: it only requires two cameras mounted on the same horizontal line and stereo-rectified to be put into use. The basic pipeline comprises: image acquisition, camera calibration, image rectification, feature extraction, stereo matching and three-dimensional reconstruction. The most important step in binocular stereo vision is stereo matching, which is an essential foundation of the binocular vision field. With the development of computer vision, stereo matching has been widely applied, for example in autonomous driving, 3D modeling and industrial control. Traditional stereo matching can achieve limited performance, but it cannot obtain good results in ill-posed regions such as weak texture, disparity discontinuities and uneven radiation. In recent years, stereo matching based on deep learning has advanced greatly over traditional algorithms: convolutional neural networks have a strong capability in feature extraction, treat stereo matching as a learning task, continuously learn optimized model parameters from large amounts of data, and finally output a disparity map.
Because of the various interferences present in the real world, early stereo matching methods based on convolutional neural networks were relatively simple and did not obtain good results in ill-posed regions. Deep learning has greatly improved the accuracy and time efficiency of such algorithms in recent years, yet considerable room for progress remains regarding feature deformation and the complexity brought by 3D convolution.
Disclosure of Invention
The invention aims to provide a stereo matching method based on deformable and depth separable convolution that remedies the existing defects: a novel convolution constructed from a deformable convolution and a deformable convolution kernel solves the fixed-sampling limitation of traditional convolution, adapts to object feature deformation, enlarges the effective receptive field and reduces information loss; a depth-separable convolution is integrated into the aggregation network, reducing the huge parameter count brought by the 3D CNN and cutting the computation amount.
In order to achieve the above object, the present invention provides a stereo matching method based on deformable and depth separable convolution, comprising:
inputting a left image and a right image into a deformable feature extraction network model to extract effective features, wherein the left image and the right image are the two images obtained by a binocular vision camera;
the effective features are concatenated and fused to obtain the cost volume;
inputting the cost volume into a depth-separable 3D CNN network model, learning features of different scales, positions and forms, and aggregating effective information to obtain the image learned by the 3D CNN network;
restoring the image learned by the 3D CNN network to the size of the left image using up-sampling, wherein the left image and the right image have the same size;
performing disparity regression prediction on the restored image with a softmax function, and outputting a disparity map;
performing iterative training on the overall stereo matching network, using a joint loss function in the training process:
L = L_L1 + λ · L_Log-cosh
where L_L1 is the smooth L1 loss function, L_Log-cosh is the Log-cosh loss function, and λ is the weight coefficient of L_Log-cosh; the overall stereo matching network comprises: the deformable feature extraction network model and the 3D CNN network.
L_L1 is the smooth L1 loss:
L_L1 = (1/N) · Σ_(i=1..N) smooth_L1(d_i − d̂_i), where smooth_L1(x) = 0.5 · x² if |x| < 1, and |x| − 0.5 otherwise
L_Log-cosh is the Log-cosh loss:
L_Log-cosh = (1/N) · Σ_(i=1..N) log(cosh(d̂_i − d_i))
where N is the number of labeled pixels, d is the ground-truth disparity value, and d̂ is the predicted disparity value.
Optionally, the deformable feature extraction network model uses a novel convolution constructed by combining a deformable convolution and a deformable convolution kernel.
In one implementation, the output pixel of the deformable convolution combined with the deformable convolution kernel is:
O(j) = Σ_k I(j + k + Δj) · W(k + Δk)
and the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m + Δk_m) · 1[i = j + Σ_(m=1..n) (k_m + Δj_m)]
where I represents the image and W the convolution kernel; i, j denote sampling positions, k a convolution kernel position, m the m-th convolution kernel, n the layer index, Δj the offset of sampling position j, and Δk the offset of convolution kernel position k.
Optionally, the effective features are concatenated and fused to obtain the cost volume:
the extracted features are connected through a concat operation;
the connected features are rearranged and combined by a convolution and fused into new features.
In one implementation, the depth-separable 3D CNN network model comprises: a 3×3 depthwise convolution and a 1×1 pointwise convolution.
Optionally, the deformable feature extraction network further comprises: a batch normalization layer and a Leaky ReLU activation layer.
In one implementation, the expression of the softmax function is:
σ(Z_i) = e^(Z_i) / Σ_j e^(Z_j)
where Z_i is the linear prediction result of the i-th class and σ(Z_i), i = 0, 1, 2, 3, …, is the probability that the data belongs to class i.
In one implementation, the convolution formula of the up-sampling is:
y = G * x_t + b_t
where G is a two-dimensional Gaussian distribution and b_t represents the convolution bias. The Gaussian kernel formula is:
G(x, y) = (1 / (2πσ²)) · exp(−((x − x_0)² + (y − y_0)²) / (2σ²))
where (x, y) is a coordinate point of the distribution, (x_0, y_0) is the center point coordinate, and σ² is the variance.
The stereo matching method based on deformable and depth separable convolution has the following beneficial effects:
(1) It addresses the matching-accuracy problems caused by ill-posed regions in stereo matching, such as weak texture, uneven radiation and disparity discontinuities, whereas conventional stereo matching does not consider the deformation of features. The deformable feature extraction network of the invention adapts to object deformation, enlarges the effective receptive field, and extracts more effective features.
(2) Although the traditional 3D CNN brings a good matching effect, it sacrifices more time and increases the computation amount and network parameters. The depth-separable 3D CNN network of the invention learns features of different scales, positions and forms, aggregates information of different dimensions, and greatly reduces the computational complexity of the algorithm.
(3) The invention introduces a joint loss function that jointly optimizes the network, allowing the algorithm to reach its full effect and improving the stereo matching accuracy.
Drawings
FIG. 1 is a schematic flow chart of a stereo matching method based on deformable and depth separable convolution according to an embodiment of the invention.
Fig. 2 is another flow chart of a stereo matching method based on deformable and depth separable convolution according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a deformable feature extraction network according to an embodiment of the invention.
Fig. 4 is a schematic diagram of a depth separable 3D CNN network model according to an embodiment of the present invention.
Detailed Description
The following describes embodiments of the invention with reference to specific examples; other advantages and effects of the invention will become readily apparent to those skilled in the art from this disclosure. The invention may also be practiced or carried out in other, different embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the invention.
The present invention provides a stereo matching method based on deformable and depth separable convolution as shown in fig. 1-2, comprising:
s110, inputting a left image and a right image into a deformable feature extraction network model to extract effective features, wherein the left image and the right image are two images obtained by a binocular vision camera respectively;
it should be noted that, in the existing stereo matching method based on the neural network, more contexts and multi-scale information are aggregated, details are added, and problems of deformation of objects are rarely considered. The invention provides a feature extraction method of a deformable convolutional neural network.
(1) Traditional convolution:
In a general convolution, the input image is I ∈ R^(D×D), the convolution kernel is W ∈ R^(k×k), and the pixel of the output image at each coordinate j ∈ R² is:
O(j) = Σ_k I(j + k) · W(k)
where I represents the image and W the convolution kernel; i, j denote sampling positions, k a convolution kernel position, m the m-th convolution kernel, and n the layer index.
For a stack of n such layers, the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m) · 1[i = j + Σ_(m=1..n) k_m]
(2) Deformable convolution:
The deformable convolution has a stronger expressive power for object properties such as scale, pose and deformation. Compared with the general convolution, the output pixel of the deformable convolution is:
O(j) = Σ_k I(j + k + Δj) · W(k)
and the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m) · 1[i = j + Σ_(m=1..n) (k_m + Δj_m)]
where Δj represents the offset of sampling position j.
During sampling, an offset is added to every sampling position, which realizes sampling at different positions and under deformation and enlarges the receptive field. Feature points of a general convolution have a receptive field of fixed size, whereas the deformable convolution adaptively learns the receptive field according to the shape and size of the object, so it conforms better to the object's characteristics and benefits feature extraction.
(3) Deformable convolution kernel:
The output pixel of the deformable convolution kernel is:
O(j) = Σ_k I(j + k) · W(k + Δk)
and the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m + Δk_m) · 1[i = j + Σ_(m=1..n) k_m]
where Δk represents the offset of convolution kernel position k.
The general convolution kernel cannot adapt to the deformation of the feature, and the deformable convolution kernel can adjust the kernel space while keeping the feature points unchanged, and compared with the general convolution kernel, the deformable convolution kernel shares the data position but has different sampling kernel values.
(4) Novel convolution:
The output pixel of the deformable convolution combined with the deformable convolution kernel is:
O(j) = Σ_k I(j + k + Δj) · W(k + Δk)
and the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m + Δk_m) · 1[i = j + Σ_(m=1..n) (k_m + Δj_m)]
The deformable feature extraction network of the invention combines the deformable convolution and the deformable convolution kernel into the novel convolution. The left and right images obtained by the binocular vision camera are input into the deformable feature extraction network model to extract effective features, adapting to the deformation of objects and extracting more effective features. The deformable feature extraction network further comprises a batch normalization layer and a Leaky ReLU activation layer, as shown in Fig. 3. Both are common layers in neural networks: batch normalization normalizes the input of each layer of the network, while the Leaky ReLU activation layer maps the inputs of neurons to outputs using the Leaky ReLU activation function, increasing the nonlinearity of the neural network.
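To make this step concrete, the following is a minimal PyTorch sketch of one deformable feature-extraction block, using torchvision's DeformConv2d for the deformable-convolution half of the novel convolution; the deformable-kernel half has no off-the-shelf torchvision operator and is omitted here. The class name DeformableBlock, the channel widths and the LeakyReLU slope are illustrative assumptions, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    """One deformable block: offsets -> deformable conv -> BN -> Leaky ReLU."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # a plain conv predicts the 2*k*k sampling offsets (the Δj) per output position
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.deform_conv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)
        self.bn = nn.BatchNorm2d(out_ch)             # batch normalization layer
        self.act = nn.LeakyReLU(0.1, inplace=True)   # Leaky ReLU activation layer

    def forward(self, x):
        offset = self.offset_conv(x)                 # learned offsets deform the sampling grid
        return self.act(self.bn(self.deform_conv(x, offset)))

# the left and right images pass through the same (weight-shared) extractor
block = DeformableBlock(3, 32)
left_feat = block(torch.randn(1, 3, 256, 512))       # -> (1, 32, 256, 512)
```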
S120, the effective features are concatenated and fused to obtain the cost volume;
It can be understood that the extracted effective features are connected through a concat operation, and the connected features are rearranged and combined by a convolution operation and fused into new features, forming the cost volume.
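As a sketch of how the concatenation can form the cost volume, the following assumes left and right feature maps of shape (N, C, H, W) and an illustrative maximum disparity max_disp; for each candidate disparity the right features are shifted horizontally and concatenated with the left features along the channel axis.

```python
import torch

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-based cost volume of shape (N, 2C, max_disp, H, W)."""
    n, c, h, w = left_feat.shape
    cost = left_feat.new_zeros(n, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        if d == 0:
            cost[:, :c, d] = left_feat
            cost[:, c:, d] = right_feat
        else:
            # right features shifted by d pixels line up with the left features
            cost[:, :c, d, :, d:] = left_feat[..., d:]
            cost[:, c:, d, :, d:] = right_feat[..., :-d]
    return cost  # consumed by the depth-separable 3D CNN
```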
S130, inputting the cost volume into the depth-separable 3D CNN network, learning features of different scales, positions and forms, and aggregating effective information to obtain the image learned by the 3D CNN network;
To improve accuracy, the cost volume is learned by the depth-separable 3D CNN, which aggregates effective information; computing separately over the spatial and channel dimensions reduces the parameter count and computational complexity brought by the 3D convolution.
The depth-separable 3D CNN decomposes a general convolution into a 3×3 depthwise convolution (depthwise convolution) and a 1×1 pointwise convolution (pointwise convolution). In general, if the input feature map size is H×W×C_1, the convolution kernel size is K×K and the output feature map size is H×W×C_2, the computation amount of the general convolution is:
α_T = H × W × C_1 × C_2 × K × K
and the computation amount of the depth-separable convolution is:
α_D = H × W × C_1 × K × K + H × W × C_1 × C_2
As these formulas show (their ratio is 1/C_2 + 1/K²), the computation amount of the depth-separable convolution is greatly reduced, which lowers the computational and time complexity of the algorithm. The depth-separable 3D CNN network model is constructed as in Fig. 4.
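The following is a minimal sketch of one such depth-separable 3D block, assuming PyTorch: a depthwise nn.Conv3d (groups equal to the channel count) is followed by a 1×1×1 pointwise convolution. The batch-normalization and Leaky ReLU layers and the channel widths are illustrative choices, not the exact Fig. 4 configuration.

```python
import torch.nn as nn

class DepthSeparable3d(nn.Module):
    """Depthwise 3x3x3 conv (per channel) followed by a 1x1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size=3, padding=1,
                                   groups=in_ch, bias=False)   # spatial/disparity mixing
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1,
                                   bias=False)                  # channel mixing
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):          # x: (N, C, D, H, W) cost volume
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```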
S140, restoring the image learned by the 3D CNN network to the size of the left image using up-sampling, wherein the left image and the right image have the same size;
It will be appreciated that the image passes through convolution operations of different sizes that change its size, so the invention uses up-sampling to restore the image learned by the 3D CNN network to the original image size.
The convolution formula is:
y = G * x_t + b_t
where G is a Gaussian convolution kernel and b_t represents the convolution bias. The Gaussian kernel formula is:
G(x, y) = (1 / (2πσ²)) · exp(−((x − x_0)² + (y − y_0)²) / (2σ²))
where (x, y) is a coordinate point of the distribution, (x_0, y_0) is the center point coordinate, and σ² is the variance.
Up-sampling principle: enlarging the image, i.e. interpolating. An image is enlarged by the following up-sampling operation (see the sketch after this list):
(1) expand the image by a factor of 2 in each direction, filling the newly added rows and columns with 0;
(2) obtain approximations of the newly added pixels by convolving the enlarged image with 4 times the Gaussian kernel described above.
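A minimal sketch of this 2× Gaussian up-sampling step, assuming PyTorch: the image is zero-stuffed and then convolved with 4 times a Gaussian kernel, the factor 4 compensating for the inserted zeros. The 5×5 kernel size and σ = 1 are illustrative choices, not values stated in the patent.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2D Gaussian kernel centered on the kernel grid."""
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return k / k.sum()

def upsample2x(img):
    """Enlarge (N, C, H, W) by 2x: zero-stuff, then blur with 4x the Gaussian kernel."""
    n, c, h, w = img.shape
    up = img.new_zeros(n, c, 2 * h, 2 * w)
    up[..., ::2, ::2] = img                      # newly added rows/columns stay 0
    k = 4.0 * gaussian_kernel().to(img)          # 4x compensates for the inserted zeros
    weight = k.expand(c, 1, *k.shape)            # one depthwise kernel per channel
    return F.conv2d(up, weight, padding=2, groups=c)
```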
And S150, performing disparity regression prediction on the restored image using the softmax function, and outputting the disparity map.
Note that the softmax function is:
σ(Z_i) = e^(Z_i) / Σ_j e^(Z_j)
where Z_i is the linear prediction result of the i-th class and σ(Z_i), i = 0, 1, 2, 3, …, is the probability that the data belongs to class i.
Regression prediction: the probability of each disparity d is computed from the matching cost c_d through the softmax function σ(·), and the predicted disparity d̂ is obtained by weighting every disparity d by its probability and summing:
d̂ = Σ_(d=0)^(D_max) d × σ(−c_d)
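A minimal sketch of this soft regression, assuming the up-sampled cost volume has shape (N, D, H, W) with one cost per candidate disparity: the negated costs pass through a softmax along the disparity axis, and the candidates are summed with their probabilities as weights.

```python
import torch
import torch.nn.functional as F

def disparity_regression(cost):
    """Soft-argmin disparity regression over a (N, D, H, W) cost volume."""
    n, d, h, w = cost.shape
    prob = F.softmax(-cost, dim=1)                       # sigma(-c_d): low cost -> high probability
    disparities = torch.arange(d, dtype=cost.dtype,
                               device=cost.device).view(1, d, 1, 1)
    return (prob * disparities).sum(dim=1)               # predicted disparity map (N, H, W)
```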
the stereo matching overall network comprises: in order to better train a network and improve the matching precision of a stereoscopic matching integral network, a combination loss function is introduced in the invention:
L=L L1 +λL Log-cosh
wherein L is L1 To smooth the L1 loss function, L Log-cosh Is a Log-dash loss function, lambda is L Log-cosh For balancing the importance of the two loss functions, the value in the experiment was 0.1.
L L1 For smoothloss:
L Log-cosh is Log-hash_loss:
wherein: n is the number of marked pixels, d is the background true value,to predict the disparity value.
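A minimal sketch of the joint loss under these definitions, assuming PyTorch tensors and a boolean mask that selects the N labeled pixels, with λ = 0.1 as stated above. The direct log(cosh(·)) form shown here can overflow for very large residuals, which a production implementation would guard against.

```python
import torch
import torch.nn.functional as F

def joint_loss(pred, gt, mask, lam=0.1):
    """L = L_L1 + lam * L_Log-cosh over the labeled pixels."""
    diff = pred[mask] - gt[mask]
    l1 = F.smooth_l1_loss(pred[mask], gt[mask])    # smooth L1 term
    log_cosh = torch.log(torch.cosh(diff)).mean()  # Log-cosh term
    return l1 + lam * log_cosh
```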
The above embodiments merely illustrate the principles and effectiveness of the invention and are not intended to limit it. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the invention.

Claims (7)

1. A stereo matching method based on deformable and depth separable convolution, comprising:
inputting a left image and a right image into a deformable feature extraction network model to extract effective features, wherein the left image and the right image are the two images obtained by a binocular vision camera;
the effective features are concatenated and fused to obtain the cost volume;
inputting the cost volume into a depth-separable 3D CNN network, learning features of different scales, positions and forms, and aggregating effective information to obtain the image learned by the 3D CNN network;
restoring the image learned by the 3D CNN network to the size of the left image using up-sampling, wherein the left image and the right image have the same size;
performing disparity regression prediction on the restored image with a softmax function, and outputting a disparity map;
performing iterative training on the overall stereo matching network, using a joint loss function in the training process:
L = L_L1 + λ · L_Log-cosh
where L_L1 is the smooth L1 loss function, L_Log-cosh is the Log-cosh loss function, and λ is the weight coefficient of L_Log-cosh; the overall stereo matching network comprises: the deformable feature extraction network model and the 3D CNN network;
L_L1 is the smooth L1 loss:
L_L1 = (1/N) · Σ_(i=1..N) smooth_L1(d_i − d̂_i), where smooth_L1(x) = 0.5 · x² if |x| < 1, and |x| − 0.5 otherwise
L_Log-cosh is the Log-cosh loss:
L_Log-cosh = (1/N) · Σ_(i=1..N) log(cosh(d̂_i − d_i))
where N is the number of labeled pixels, d is the ground-truth disparity value, and d̂ is the predicted disparity value;
the output pixel of the deformable convolution combined with the deformable convolution kernel is:
O(j) = Σ_k I(j + k + Δj) · W(k + Δk)
and the effective receptive field is:
R(i; j) = Σ_(k_1,…,k_n) Π_(m=1..n) W_m(k_m + Δk_m) · 1[i = j + Σ_(m=1..n) (k_m + Δj_m)]
where I represents the image, W the convolution kernel, i, j the sampling positions, k a convolution kernel position, m the m-th convolution kernel, n the layer index, Δj the offset of sampling position j, and Δk the offset of convolution kernel position k.
2. The stereo matching method based on deformable and depth separable convolution according to claim 1, wherein the deformable feature extraction network model uses a novel convolution constructed by combining a deformable convolution and a deformable convolution kernel.
3. A stereo matching method based on deformable and depth separable convolution according to claim 1, wherein the effective features are concatenated and fused to obtain the cost volume by:
connecting the extracted features through a concat operation;
rearranging and combining the connected features by a convolution, fusing them into new features.
4. The stereo matching method based on deformable and depth separable convolution according to claim 1, wherein the depth-separable 3D CNN network model comprises: a 3×3 depthwise convolution and a 1×1 pointwise convolution.
5. A stereo matching method based on deformable and depth separable convolution as recited in any one of claims 1 to 4, wherein the deformable feature extraction network further comprises: a batch normalization layer and a Leaky ReLU activation layer.
6. The stereo matching method based on deformable and depth separable convolution according to claim 1, wherein the expression of the softmax function is:
σ(Z_i) = e^(Z_i) / Σ_j e^(Z_j)
where Z_i is the linear prediction result of the i-th class and σ(Z_i), i = 0, 1, 2, 3, …, is the probability that the data belongs to class i.
7. A stereo matching method based on deformable and depth separable convolution according to claim 1, wherein the convolution formula of the up-sampling is:
y = G * x_t + b_t
where G is a two-dimensional Gaussian distribution and b_t represents the convolution bias. The Gaussian kernel formula is:
G(x, y) = (1 / (2πσ²)) · exp(−((x − x_0)² + (y − y_0)²) / (2σ²))
where (x, y) is a coordinate point of the distribution, (x_0, y_0) is the center point coordinate, and σ² is the variance.
CN202110916262.2A 2021-08-11 2021-08-11 Stereo matching method based on deformable and depth separable convolution Active CN113592021B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110916262.2A CN113592021B (en) 2021-08-11 2021-08-11 Stereo matching method based on deformable and depth separable convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110916262.2A CN113592021B (en) 2021-08-11 2021-08-11 Stereo matching method based on deformable and depth separable convolution

Publications (2)

Publication Number Publication Date
CN113592021A CN113592021A (en) 2021-11-02
CN113592021B (en) 2024-03-22

Family

ID=78256983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110916262.2A Active CN113592021B (en) 2021-08-11 2021-08-11 Stereo matching method based on deformable and depth separable convolution

Country Status (1)

Country Link
CN (1) CN113592021B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998453A (en) * 2022-08-08 2022-09-02 State Grid Zhejiang Electric Power Co., Ltd. Ningbo Power Supply Company Stereo matching model based on high-scale unit and application method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119808A1 (en) * 2016-12-29 2018-07-05 浙江工商大学 Stereo video generation method based on 3d convolutional neural network
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks
CN111402129A (en) * 2020-02-21 2020-07-10 西安交通大学 Binocular stereo matching method based on joint up-sampling convolutional neural network
CN111696148A (en) * 2020-06-17 2020-09-22 中国科学技术大学 End-to-end stereo matching method based on convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119808A1 (en) * 2016-12-29 2018-07-05 浙江工商大学 Stereo video generation method based on 3d convolutional neural network
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks
CN111402129A (en) * 2020-02-21 2020-07-10 西安交通大学 Binocular stereo matching method based on joint up-sampling convolutional neural network
CN111696148A (en) * 2020-06-17 2020-09-22 中国科学技术大学 End-to-end stereo matching method based on convolutional neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An improved stereo matching algorithm based on PSMNet; 刘建国; 冯云剑; 纪郭; 颜伏伍; 朱仕卓; Journal of South China University of Technology (Natural Science Edition) (01); full text *
Depth estimation method for tomato plant images based on self-supervised learning; 周云成; 许童羽; 邓寒冰; 苗腾; 吴琼; Transactions of the Chinese Society of Agricultural Engineering (24); full text *

Also Published As

Publication number Publication date
CN113592021A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
CN112270249B (en) Target pose estimation method integrating RGB-D visual characteristics
CN110728219B (en) 3D face generation method based on multi-column multi-scale graph convolution neural network
CN110705448B (en) Human body detection method and device
CN111275518B (en) Video virtual fitting method and device based on mixed optical flow
CN111428586B (en) Three-dimensional human body posture estimation method based on feature fusion and sample enhancement
CN111161349B (en) Object posture estimation method, device and equipment
CN108921926B (en) End-to-end three-dimensional face reconstruction method based on single image
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN106780592A (en) Kinect depth reconstruction algorithms based on camera motion and image light and shade
CN110738697A (en) Monocular depth estimation method based on deep learning
CN106981080A (en) Night unmanned vehicle scene depth method of estimation based on infrared image and radar data
CN104899921B (en) Single-view videos human body attitude restoration methods based on multi-modal own coding model
CN110047101A (en) Gestures of object estimation method, the method for obtaining dense depth image, related device
CN114663502A (en) Object posture estimation and image processing method and related equipment
CN111968165A (en) Dynamic human body three-dimensional model completion method, device, equipment and medium
CN111553869A (en) Method for complementing generated confrontation network image under space-based view angle
JP2023545189A (en) Image processing methods, devices, and electronic equipment
CN114757904A (en) Surface defect detection method based on AI deep learning algorithm
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning
CN113592021B (en) Stereo matching method based on deformable and depth separable convolution
CN115761791A (en) Human body semantic prediction module based on 2D image, virtual clothes changing model and method
CN116385667B (en) Reconstruction method of three-dimensional model, training method and device of texture reconstruction model
CN115953330B (en) Texture optimization method, device, equipment and storage medium for virtual scene image
CN116978057A (en) Human body posture migration method and device in image, computer equipment and storage medium
CN116977683A (en) Object recognition method, apparatus, computer device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant