CN112822479A - Depth map generation method and device for 2D-3D video conversion - Google Patents

Depth map generation method and device for 2D-3D video conversion

Info

Publication number
CN112822479A
CN112822479A (application CN202011628929.0A)
Authority
CN
China
Prior art keywords
depth
background
image
foreground
depth map
Prior art date
Legal status
Pending
Application number
CN202011628929.0A
Other languages
Chinese (zh)
Inventor
张现丰
刘海军
王璇章
庄庄
聂耳
钱炫羲
张雄飞
Current Assignee
Beijing Hualu Media Information Technology Co ltd
China Hualu Group Co Ltd
Original Assignee
Beijing Hualu Media Information Technology Co ltd
China Hualu Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Hualu Media Information Technology Co ltd, China Hualu Group Co Ltd filed Critical Beijing Hualu Media Information Technology Co ltd
Priority to CN202011628929.0A
Publication of CN112822479A
Status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20: Image signal generators
    • H04N13/271: Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20: Image signal generators
    • H04N13/286: Image signal generators having separate monoscopic and stereoscopic modes
    • H04N13/289: Switching between monoscopic and stereoscopic modes

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of video and multimedia signal processing, and particularly relates to a depth map generation method and device for 2D-3D video conversion, wherein the method comprises the following steps. Step 1: process the 2D video frame by frame to obtain an original video frame sequence. Step 2: process the video frame sequence of step 1 with a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image. Step 3: acquire the motion vectors in the foreground image of step 2 with an optical flow method, and obtain a foreground depth map from the relation between motion vector and depth. Step 4: perform depth assignment on the background image of step 2 by extracting the vanishing lines and vanishing point of the background image with a geometric perspective method, obtaining a background depth map. Step 5: fuse the foreground depth map of step 3 and the background depth map of step 4 with a depth fusion method to obtain a fused depth map. The method not only improves the quality of the depth map but also has a wider application range.

Description

Depth map generation method and device for 2D-3D video conversion
Technical Field
The invention belongs to the technical field of video and multimedia signal processing, and particularly relates to a depth map generation method and device for 2D-3D video conversion.
Background
With the rapid development of computer and communication technology in recent years, advances in science and technology have transformed every aspect of daily life. In communication technology, 3D stereoscopic video is characterized by clear, vivid pictures with a strong sense of space: objects appear to float out of the screen or to recede deep into it. This strong spatial impression is widely sought after, and against the visual fatigue caused by years of flat 2D television it brings viewers incomparable visual enjoyment and strong visual impact. Converting 2D video into 3D video is therefore of great significance for the development and distribution of video.
One of the most important links in 2D-3D video conversion is the generation of the depth map. Depth map generation methods fall into three categories: manual, fully automatic, and semi-automatic. 1) Manual depth map generation segments each frame of the two-dimensional video and then assigns a depth value to each segmented block; the resulting depth map is highly accurate, but the approach consumes a great deal of labor and time, making it unsuitable for large-scale stereoscopic video production. 2) Fully automatic algorithms that use multiple images convert motion vectors into a depth map, based on the assumption that fast-moving objects are close to the camera and slow-moving objects are far from it. The depth map obtained by such motion-based depth estimation loses some image information when median filtering is used to smooth it, so its overall quality is relatively poor; and although the method can recover rich depth information, its computational load is heavy and conversion is slow, which hinders real-time conversion. 3) Semi-automatic algorithms divide a video into segments according to scene content, set the boundary frames between segments as key frames, and treat the frames between two key frames as non-key frames.
In view of the above, the present invention provides a depth map generation method and apparatus for 2D-3D video conversion.
Disclosure of Invention
The invention aims to provide a depth map generation method and device for 2D-3D video conversion, so as to solve the problem that depth map generation methods in the prior art are severely limited and cannot be widely applied.
The invention provides a depth map generation method for 2D-3D video conversion, which comprises the following steps. Step 1: processing the 2D video frame by frame to obtain an original video frame sequence. Step 2: processing the video frame sequence with a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image. Step 3: acquiring the motion vectors in the foreground image with an optical flow method, and obtaining a foreground depth map from the relation between motion vector and depth. Step 4: performing depth assignment on the background image by extracting its vanishing lines and vanishing point with a geometric perspective method, to obtain a background depth map. Step 5: fusing the foreground depth map and the background depth map with a depth fusion method to obtain a fused depth map.
As described above, in the depth map generation method for 2D-3D video conversion, it is further preferable that step 2 specifically comprises. Step 2.1: calculating a background update rate based on the mean and variance of each pixel of the current frame in the video frame sequence, and preliminarily separating background pixels from foreground pixels based on the background update rate to obtain a foreground separation image and a background separation image. Step 2.2: performing Gaussian background reconstruction on the background separation image to obtain a background image. Step 2.3: acquiring the difference image between the background image and the original video frame; the binarized difference image is the foreground image.
In the depth map generating method for 2D-3D video conversion as described above, it is further preferable that in step 2.1, the calculation formula of the background update rate is:
p = (1/√(2πd)) · e^(−(x − u)^2/(2d))
wherein p is the background update rate, u is the mean of the pixel in the original video frame, d is the variance of the pixel in the original video frame, e is a natural constant, and x is the abscissa of the pixel in the original video frame.
As described above, in the depth map generating method for 2D-3D video conversion, it is further preferable that, in step 2.2, gaussian background reconstruction is performed based on the following formula, specifically, the reconstruction formula is:
G_{b,t}(x, y) = p·G_{b,t-1}(x, y) + (1 − p)·G_t(x, y)
where p is the background update rate, G_t(x, y) is the value of the original video frame at time t, G_{b,t}(x, y) is the value of the updated background image at time t, and G_{b,t-1}(x, y) is the value of the updated background image at time t-1.
As described above, in the depth map generating method for 2D-3D video conversion, it is further preferable that, in step 2.3, the difference image is obtained based on the following formula, specifically, the formula is:
D_k(x, y) = |I_k(x, y) − G_{b,t}(x, y)|,
where D_k(x, y) is the difference image, G_{b,t}(x, y) is the updated background image obtained in step 2.2, and I_k(x, y) is the original video frame, each evaluated at point (x, y).
As described above, in the depth map generating method for 2D-3D video conversion, it is further preferable that step 3 specifically includes: step 3.1: calculating an optical flow motion vector according to an optical flow motion vector calculation formula, wherein the optical flow motion vector calculation formula is as follows:
V = (A^T W^2 A)^(-1) A^T W^2 b,
where A = [∇I(X_1), ∇I(X_2), …, ∇I(X_n)]^T is the matrix whose rows are the spatial gradients [I_i, I_j] at the n pixels X_1 … X_n contained in the neighborhood of pixel (i, j); W = diag(w_1, w_2, …, w_n) is a diagonal weight matrix; b = [I_t1, I_t2, …, I_tn]^T is the vector of temporal gradients; A^T is the transpose of A; and V is the motion vector;
step 3.2: calculating a foreground depth value based on a depth information calculation formula; the foreground depth calculation formula is as follows:
G_f(i, j) = λ·√(V(i, j)_x^2 + V(i, j)_y^2)
where G_f is the foreground depth, λ is the depth adjustment coefficient, V(i, j)_x is the motion vector component of pixel (i, j) in the x-axis direction, and V(i, j)_y is the motion vector component of pixel (i, j) in the y-axis direction.
As described above, in the depth map generating method for 2D-3D video conversion, preferably, the step 4 specifically includes:
step 4.1: performing edge detection on the video frame sequence obtained in step 1 with a Sobel operator to obtain a horizontal edge gradient map, a vertical edge gradient map, and an edge gradient map obtained by fusing the two;
step 4.2: calculating a gradient level threshold of the edge gradient map in the step 4.1, and obtaining an edge image based on the edge gradient map and the gradient level threshold, wherein the gradient of each pixel point in the edge image is greater than the gradient level threshold;
step 4.3: drawing a straight line in the k-b space for each pixel point in the edge image obtained in the step 4.2 based on a geometric perspective method; assigning values to the pixel points based on the number of straight lines passing through each pixel point; traversing k-b space, defining a maximum value point as a vanishing point, and defining straight lines around the vanishing point as vanishing lines;
step 4.4: calculating to obtain the depth value of the background image based on the depth value calculation formula of the line and point vanishing obtained in the step 4.3, wherein the depth value calculation formula is as follows:
G_b = 255 − round(round(j·(|x_o − y_o|)/y_o)·255/(|x_o − y_o|))
where G_b is the background depth of the pixel points between two adjacent vanishing lines, x_o is the abscissa of the intersection of two adjacent vanishing lines, y_o is the ordinate of that intersection, j is a constant, and round(·) is the rounding operation.
In the depth map generating method for 2D-3D video conversion as described above, it is further preferable that in step 5, the fusion formula of the depth fusion method is:
G_d(x, y) = { G_f(x, y), if I(x, y) = 255;  G_b(x, y), if I(x, y) = 0 }
where G_d is the final depth map, I(x, y) is the pixel value at point (x, y), G_f(x, y) is the foreground depth of pixel (x, y), and G_b(x, y) is the background depth of pixel (x, y).
The invention also discloses a depth map generation device for 2D-3D video conversion, used to implement the above depth map generation method, which comprises: a video acquisition module, for processing the 2D video frame by frame to obtain an original video frame sequence; a pixel separation module, for processing the video frame sequence obtained by the video acquisition module with a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image; a foreground depth calculation module, for acquiring the motion vectors in the foreground image obtained by the pixel separation module with an optical flow method, and obtaining a foreground depth map from the relation between motion vector and depth; a background depth calculation module, for performing depth assignment on the background image obtained by the pixel separation module by extracting its vanishing lines and vanishing point with a geometric perspective method, to obtain a background depth map; and a depth fusion module, for fusing the foreground depth map obtained by the foreground depth calculation module and the background depth map obtained by the background depth calculation module with a depth fusion method to obtain a fused depth map.
Compared with the prior art, the invention has the following advantages:
the invention discloses a depth map generation method and a depth map generation device for 2D-3D video conversion, which mainly separate a moving foreground from a relatively static background by applying the technology of combining a Gaussian background modeling method and a background subtraction method to an input original 2D video image sequence; then, respectively generating a background depth map by the moving foreground and the background; and finally, fusing the foreground and background depth maps to obtain the depth map generation method based on fusion of the foreground and the background. The method for extracting the depth map not only improves the quality of the depth map, but also has wider application range; in addition, compared with a method for acquiring a depth map from a single depth cue, the method is more perfect, has wider application range, effectively utilizes important depth cues in a video scene, and improves the quality of the depth map.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a depth map generation method for 2D-3D video conversion in the present invention;
fig. 2 is a connection diagram of a frame of a depth map generating apparatus for 2D-3D video conversion according to the present invention.
Detailed Description
Fig. 1 is a flowchart of a depth map generating method for 2D-3D video conversion according to the present invention, and specifically, as shown in fig. 1, the present embodiment discloses a depth map generating method for 2D-3D video conversion, which includes the following steps:
step 1: Processing the 2D video frame by frame to obtain an original video frame sequence;
step 2: Processing the video frame sequence obtained in step 1 by adopting a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image;
step 3: Acquiring the motion vectors in the foreground image obtained in step 2 by adopting an optical flow method, and obtaining a foreground depth map according to the relation between motion vector and depth;
step 4: Performing depth assignment on the background image obtained in step 2 by extracting the vanishing lines and vanishing point of the background image by adopting a geometric perspective method, to obtain a background depth map;
step 5: Fusing the foreground depth map obtained in step 3 and the background depth map obtained in step 4 by adopting a depth fusion method to obtain a fused depth map.
In step 1, the obtained video frame is an image with subtitles removed, that is, the subtitles of the 2D video are removed first, and then the 2D video is processed frame by frame.
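By way of illustration only (not part of the patent text), step 1 can be sketched in Python with OpenCV; extract_frames is a hypothetical helper name, and the grayscale conversion is an assumption of this sketch:

import cv2

def extract_frames(video_path):
    """Step 1: read the (subtitle-free) 2D video frame by frame into an
    original video frame sequence of grayscale images."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()
    return frames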
Further, step 2 specifically includes:
step 2.1: Calculating a background update rate based on the mean and variance of each pixel of the current frame in the video frame sequence obtained in step 1, and preliminarily separating background pixels from foreground pixels based on the background update rate to obtain a foreground separation image and a background separation image;
step 2.2: Performing Gaussian background reconstruction on the background separation image obtained in step 2.1 to obtain a background image;
step 2.3: Acquiring the difference image between the background image obtained in step 2.2 and the original video frame obtained in step 1; the binarized difference image is the foreground image.
In step 2.1, the brightness of the pixels of a background image follows a Gaussian distribution; that is, for the background image, the brightness of each pixel (x, y) satisfies G_b(x, y) ~ N(u, d), where the mean u and variance d are attributes specific to each point, and
p = (1/√(2πd)) · e^(−(x − u)^2/(2d))
where p is the background update rate, u is the mean of the pixel in the original video frame, and d is the variance of the pixel in the original video frame; e is the natural constant, an irrational number approximately equal to 2.72; x is the abscissa of the pixel in the original video frame. The mean u and variance d are attributes specific to each pixel, and they change when the coordinates of the point change.
The mean u and variance d of each point in the video frame sequence over a period of time are calculated to form the background model B. For an arbitrary image G of the sequence containing the foreground, check for each pixel (x, y) on the image whether:
|G(x, y) − G_b(x, y)| < T·d
the point is considered as a background point, otherwise, the point is considered as a foreground point;
where T is a constant threshold (T = 3 in this embodiment), G(x, y) is the value of the original video frame image G at point (x, y), and G_b(x, y) is the value of the background model B at that point.
In step 2.2, the background pixel points are updated based on the following formula, specifically, the updating formula is:
G_{b,t}(x, y) = p·G_{b,t-1}(x, y) + (1 − p)·G_t(x, y),
where p is the background update rate, a constant; the larger p is, the slower the background updates (p = 0.004 in this embodiment). G_t(x, y) is the value of the original video frame at time t, G_{b,t}(x, y) is the value of the updated background image at time t, and G_{b,t-1}(x, y) is the value of the updated background image at time t-1. Typically, d changes only slightly after the background update, so d is generally not updated. This yields the reconstructed video background.
In step 2.3, the difference image is obtained based on the following formula, specifically:
D_k(x, y) = |I_k(x, y) − G_{b,t}(x, y)|,
where D_k(x, y) is the difference image, G_{b,t}(x, y) is the updated background image obtained in step 2.2, and I_k(x, y) is the original video frame, each evaluated at point (x, y).
The difference image D_k(x, y) is then binarized to obtain G_f(x, y); specifically,
G_f(x, y) = { 255, if D_k(x, y) > T;  0, otherwise }
where G_f(x, y) is the foreground image, D_k(x, y) is the difference image, and T is the binarization threshold.
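For concreteness, a minimal NumPy sketch of steps 2.1-2.3 follows (illustrative only, not part of the patent text; the binarization threshold thresh is an assumption, since the patent does not fix one):

import numpy as np

def background_update_rate(x, u, d):
    """Step 2.1: p = (1/sqrt(2*pi*d)) * exp(-(x - u)^2 / (2d)) per pixel."""
    return np.exp(-((x - u) ** 2) / (2.0 * d)) / np.sqrt(2.0 * np.pi * d)

def is_background(G, G_b, d, T=3.0):
    """Per-pixel test |G(x,y) - G_b(x,y)| < T*d (T = 3 in the embodiment)."""
    return np.abs(G - G_b) < T * d

def reconstruct_background(bg_prev, frame, p=0.004):
    """Step 2.2: G_{b,t} = p*G_{b,t-1} + (1 - p)*G_t, with p = 0.004."""
    return p * bg_prev + (1.0 - p) * frame

def foreground_image(frame, bg, thresh=25.0):
    """Step 2.3: binarize the difference image D_k = |I_k - G_{b,t}|."""
    diff = np.abs(frame.astype(np.float32) - bg.astype(np.float32))
    return np.where(diff >= thresh, 255, 0).astype(np.uint8)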
Further, step 3 specifically includes:
step 3.1: calculating an optical flow motion vector according to an optical flow motion vector calculation formula, wherein the optical flow motion vector calculation formula is as follows:
V = (A^T W^2 A)^(-1) A^T W^2 b,
where A = [∇I(X_1), ∇I(X_2), …, ∇I(X_n)]^T is the matrix whose rows are the spatial gradients [I_i, I_j] at the n pixels X_1 … X_n contained in the neighborhood of pixel (i, j); W = diag(w_1, w_2, …, w_n) is a diagonal weight matrix; b = [I_t1, I_t2, …, I_tn]^T is the vector of temporal gradients; A^T is the transpose of A; and V is the motion vector.
Specifically, according to the basic constraint equation of the optical flow method, assuming that the brightness value of pixel (i, j) in the image at time t is I(i, j, t), and that u(i, j) and v(i, j) denote the motion components of the optical flow at this point in the i and j directions, it follows that:
I_i·u + I_j·v + I_t = 0,
where i is the abscissa of the pixel, j is the ordinate of the pixel, and I is the gray level of the pixel; I_i = ∂I/∂i is the rate of change of the image gray level with i, I_j = ∂I/∂j is the rate of change of the image gray level with j, and I_t = ∂I/∂t is the rate of change of the image gray level with time t; u = di/dt denotes the motion speed of the reference point in the i direction, and v = dj/dt denotes the motion speed of the reference point in the j direction.
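A minimal sketch of this weighted least-squares solve for one neighborhood follows (illustrative only; the sign of b is taken from the constraint equation I_i·u + I_j·v + I_t = 0, and the window weights w are assumed given):

import numpy as np

def lk_motion_vector(I_i, I_j, I_t, w=None):
    """Solve V = (A^T W^2 A)^(-1) A^T W^2 b for the n pixels of one
    neighborhood; I_i, I_j, I_t are length-n arrays of gradients."""
    A = np.stack([I_i, I_j], axis=1)       # n x 2 rows of spatial gradients
    b = -I_t                               # from I_i*u + I_j*v + I_t = 0
    W2 = np.diag(w ** 2) if w is not None else np.eye(len(I_t))
    return np.linalg.solve(A.T @ W2 @ A, A.T @ W2 @ b)  # V = (u, v)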
step 3.2: calculating a foreground depth value based on a depth information calculation formula; the foreground depth calculation formula is as follows:
G_f(i, j) = λ·√(V(i, j)_x^2 + V(i, j)_y^2)
where λ is the depth adjustment coefficient, V(i, j)_x and V(i, j)_y are the components of the motion vector of pixel (i, j), and V = (u, v)^T is the motion vector. The depth of the entire depth frame is rescaled by adjusting the size of λ. To obtain a three-dimensional video with a good parallax effect, λ is set to
λ = 255 / max(V),
where max(V) is the magnitude of the largest motion vector in the extracted motion vector field; the depth map is a grayscale map with range 0-255, and 255 is its maximum value. The foreground depth map is finally obtained through this depth information assignment.
Further, step 4 specifically includes:
step 4.1: performing edge detection on the video frame sequence obtained in step 1 with a Sobel operator to obtain a horizontal edge gradient map, a vertical edge gradient map, and an edge gradient map obtained by fusing the two;
step 4.2: calculating a gradient level threshold of the edge gradient map in the step 4.1, and obtaining an edge image based on the edge gradient map and the gradient level threshold, wherein the gradient of each pixel point in the edge image is greater than the gradient level threshold;
step 4.3: drawing a straight line in the k-b space for each pixel point in the edge image obtained in the step 4.2 based on a geometric perspective method; assigning values to the pixel points based on the number of straight lines passing through each pixel point; traversing k-b space, defining a maximum value point as a vanishing point, and defining straight lines around the vanishing point as vanishing lines;
step 4.4: calculating to obtain the depth value of the background image based on the depth value calculation formula of the line and point vanishing obtained in the step 4.3, wherein the depth value calculation formula is as follows:
G_b = 255 − round(round(j·(|x_o − y_o|)/y_o)·255/(|x_o − y_o|))
where G_b is the background depth of the pixel points between two adjacent vanishing lines, x_o is the abscissa of the intersection of two adjacent vanishing lines, y_o is the ordinate of that intersection, j is a constant, and round(·) is the rounding operation.
In step 4.2, the gradient threshold calculation formula is:
C_t = α·[S_max(x, y) − S_min(x, y)] + S_min(x, y),
where α is a weight coefficient with a value between 0 and 1, S_max(x, y) is the maximum value in the edge gradient map, and S_min(x, y) is the minimum value in the edge gradient map.
In step 4.3, the k-b space is the parameter plane of the slope-intercept line equation y = kx + b, i.e., the plane spanned by the slope k and the intercept b, in which each edge pixel of the image maps to a straight line.
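An illustrative OpenCV sketch of steps 4.1-4.3 follows; it approximates the k-b space vote with OpenCV's Hough transform, which votes in the equivalent ρ-θ parameterization, and α = 0.5 and the Hough threshold are assumptions:

import cv2
import numpy as np

def edge_image(gray, alpha=0.5):
    """Steps 4.1-4.2: Sobel gradient maps and the gradient-level threshold
    C_t = alpha*(S_max - S_min) + S_min."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)   # horizontal edge gradient map
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)   # vertical edge gradient map
    grad = np.abs(gx) + np.abs(gy)           # fused edge gradient map
    Ct = alpha * (grad.max() - grad.min()) + grad.min()
    return (grad > Ct).astype(np.uint8) * 255

def dominant_lines(edges):
    """Step 4.3, approximated: the strongest Hough lines serve as vanishing
    lines, and their common intersection approximates the vanishing point.
    Returns rho-theta line parameters (or None if no line is found)."""
    return cv2.HoughLines(edges, 1, np.pi / 180, threshold=150)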
Further, in step 5, the fusion formula of the depth fusion method is as follows:
G_d(x, y) = { G_f(x, y), if I(x, y) = 255;  G_b(x, y), if I(x, y) = 0 }
where G_d is the final depth map, I(x, y) is the pixel value of the binarized foreground image at point (x, y), G_f(x, y) is the foreground depth of pixel (x, y), and G_b(x, y) is the background depth of pixel (x, y).
The depth value G_d of the final depth map is assigned case by case: when the pixel value I(x, y) of a point is 255, the point is judged to belong to the moving foreground region, and the depth value G_f of the foreground depth map is assigned to the final depth map; when the pixel value of the point is 0, the depth value of the final depth map is the depth value G_b of the background depth map. The fused depth map is thus obtained.
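A one-line NumPy sketch of this fusion rule (illustrative only):

import numpy as np

def fuse_depth(fg_mask, G_f, G_b):
    """Step 5: G_d = G_f where the binarized foreground image is 255,
    and G_b where it is 0."""
    return np.where(fg_mask == 255, G_f, G_b).astype(np.uint8)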
Fig. 2 is a frame connection diagram of a depth map generating device for 2D-3D video conversion according to the present invention, and as shown in fig. 2, this embodiment further discloses a depth map generating device for 2D-3D video conversion, which is used to implement the depth map generating method for 2D-3D video conversion described in embodiment 1, and includes:
the video acquisition module is used for processing the 2D video frame by frame to obtain an original video frame sequence;
the pixel separation module is used for processing the video frame sequence obtained by the video acquisition module by adopting a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image;
the foreground depth calculation module is used for acquiring the motion vector in the foreground image acquired by the pixel separation module by adopting an optical flow method and acquiring a foreground depth map according to the relationship between the motion vector and the depth;
the background depth calculation module is used for carrying out depth assignment on the background image obtained by the pixel separation module by extracting the vanishing line and the vanishing point of the background image by adopting a geometric perspective method to obtain a background depth map;
and the depth fusion module is used for fusing the foreground depth map obtained by the foreground depth calculation module and the background depth map obtained by the background depth calculation module by adopting a depth fusion method to obtain a fusion depth map.
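For illustration, the five modules can be wired as below; the class name and the helpers estimate_flow and background_depth are hypothetical placeholders standing in for the foreground and background depth calculations sketched earlier:

class DepthMapGenerator:
    def generate(self, video_path):
        frames = extract_frames(video_path)            # video acquisition module
        bg = frames[0].astype('float32')
        depth_maps = []
        for prev, cur in zip(frames, frames[1:]):
            bg = reconstruct_background(bg, cur)       # pixel separation module
            mask = foreground_image(cur, bg)
            G_f = foreground_depth(estimate_flow(prev, cur))  # foreground depth module
            G_b = background_depth(bg)                 # background depth module
            depth_maps.append(fuse_depth(mask, G_f, G_b))     # depth fusion module
        return depth_maps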
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A depth map generation method for 2D-3D video conversion, comprising the steps of:
step 1: processing the 2D video frame by frame to obtain an original video frame sequence;
step 2: processing the video frame sequence obtained in step 1 by adopting a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image;
step 3: acquiring the motion vectors in the foreground image obtained in step 2 by adopting an optical flow method, and obtaining a foreground depth map according to the relation between motion vector and depth;
step 4: performing depth assignment on the background image obtained in step 2 by extracting the vanishing lines and vanishing point of the background image by adopting a geometric perspective method, to obtain a background depth map;
step 5: fusing the foreground depth map obtained in step 3 and the background depth map obtained in step 4 by adopting a depth fusion method to obtain a fused depth map.
2. The method according to claim 1, wherein the step 2 specifically comprises:
step 2.1: calculating a background update rate based on the mean and variance of each pixel of the current frame in the video frame sequence obtained in step 1, and preliminarily separating background pixels from foreground pixels based on the background update rate to obtain a foreground separation image and a background separation image;
step 2.2: performing Gaussian background reconstruction on the background separation image obtained in step 2.1 to obtain a background image;
step 2.3: acquiring the difference image between the background image obtained in step 2.2 and the original video frame obtained in step 1, wherein the binarized difference image is the foreground image.
3. The method of claim 2, wherein in step 2.1, the background update rate is calculated by the formula:
p = (1/√(2πd)) · e^(−(x − u)^2/(2d))
wherein p is the background update rate, u is the mean of the pixel in the original video frame, d is the variance of the pixel in the original video frame, e is a natural constant, and x is the abscissa of the pixel in the original video frame.
4. The method according to claim 3, wherein in step 2.2, the Gaussian background reconstruction is performed based on the following formula, specifically, the reconstruction formula is:
G_{b,t}(x, y) = p·G_{b,t-1}(x, y) + (1 − p)·G_t(x, y)
where p is the background update rate, G_t(x, y) is the value of the original video frame at time t, G_{b,t}(x, y) is the value of the updated background image at time t, and G_{b,t-1}(x, y) is the value of the updated background image at time t-1.
5. The method according to claim 4, wherein in step 2.3, the difference image is obtained based on the following formula, specifically:
D_k(x, y) = |I_k(x, y) − G_{b,t}(x, y)|,
where D_k(x, y) is the difference image, G_{b,t}(x, y) is the updated background image obtained in step 2.2, and I_k(x, y) is the original video frame, each evaluated at point (x, y).
6. The method according to claim 5, wherein step 3 specifically comprises:
step 3.1: calculating an optical flow motion vector according to an optical flow motion vector calculation formula, wherein the optical flow motion vector calculation formula is as follows:
V = (A^T W^2 A)^(-1) A^T W^2 b,
wherein A = [∇I(X_1), ∇I(X_2), …, ∇I(X_n)]^T is the matrix whose rows are the spatial gradients [I_i, I_j] at the n pixels X_1 … X_n contained in the neighborhood of pixel (i, j); W = diag(w_1, w_2, …, w_n) is a diagonal weight matrix; b = [I_t1, I_t2, …, I_tn]^T is the vector of temporal gradients; A^T is the transpose of A; and V is the motion vector;
step 3.2: calculating a foreground depth value based on a depth information calculation formula; the foreground depth calculation formula is as follows:
G_f(i, j) = λ·√(V(i, j)_x^2 + V(i, j)_y^2)
where G_f is the foreground depth, λ is the depth adjustment coefficient, V(i, j)_x is the motion vector component of pixel (i, j) in the x-axis direction, and V(i, j)_y is the motion vector component of pixel (i, j) in the y-axis direction.
7. The method according to claim 6, wherein step 4 specifically comprises:
step 4.1: performing edge detection on the video frame sequence obtained in step 1 with a Sobel operator to obtain a horizontal edge gradient map, a vertical edge gradient map, and an edge gradient map obtained by fusing the two;
step 4.2: calculating a gradient level threshold of the edge gradient map in the step 4.1, and obtaining an edge image based on the edge gradient map and the gradient level threshold, wherein the gradient of each pixel point in the edge image is greater than the gradient level threshold;
step 4.3: drawing a straight line in the k-b space for each pixel point in the edge image obtained in the step 4.2 based on a geometric perspective method; assigning values to the pixel points based on the number of straight lines passing through each pixel point; traversing k-b space, defining a maximum value point as a vanishing point, and defining straight lines around the vanishing point as vanishing lines;
step 4.4: calculating to obtain the depth value of the background image based on the depth value calculation formula of the line and point vanishing obtained in the step 4.3, wherein the depth value calculation formula is as follows:
G_b = 255 − round(round(j·(|x_o − y_o|)/y_o)·255/(|x_o − y_o|))
where G_b is the background depth of the pixel points between two adjacent vanishing lines, x_o is the abscissa of the intersection of two adjacent vanishing lines, y_o is the ordinate of that intersection, j is a constant, and round(·) is the rounding operation.
8. The method of claim 7, wherein in step 5, the fusion formula of the depth fusion method is:
G_d(x, y) = { G_f(x, y), if I(x, y) = 255;  G_b(x, y), if I(x, y) = 0 }
where G_d is the final depth map, I(x, y) is the pixel value at point (x, y), G_f(x, y) is the foreground depth of pixel (x, y), and G_b(x, y) is the background depth of pixel (x, y).
9. A depth map generation apparatus for 2D-3D video conversion, characterized in that it implements the depth map generation method for 2D-3D video conversion according to any one of claims 1-8, and comprises:
the video acquisition module is used for processing the 2D video frame by frame to obtain an original video frame sequence;
the pixel separation module is used for processing the video frame sequence obtained by the video acquisition module by adopting a Gaussian background modeling method and a background subtraction method to obtain a foreground image and a background image;
the foreground depth calculation module is used for acquiring the motion vector in the foreground image acquired by the pixel separation module by adopting an optical flow method and acquiring a foreground depth map according to the relationship between the motion vector and the depth;
the background depth calculation module is used for carrying out depth assignment on the background image obtained by the pixel separation module by extracting the vanishing line and the vanishing point of the background image by adopting a geometric perspective method to obtain a background depth map;
and the depth fusion module is used for fusing the foreground depth map obtained by the foreground depth calculation module and the background depth map obtained by the background depth calculation module by adopting a depth fusion method to obtain a fusion depth map.
CN202011628929.0A 2020-12-30 2020-12-30 Depth map generation method and device for 2D-3D video conversion Pending CN112822479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011628929.0A CN112822479A (en) 2020-12-30 2020-12-30 Depth map generation method and device for 2D-3D video conversion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011628929.0A CN112822479A (en) 2020-12-30 2020-12-30 Depth map generation method and device for 2D-3D video conversion

Publications (1)

Publication Number Publication Date
CN112822479A (en) 2021-05-18

Family

ID=75855061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011628929.0A Pending CN112822479A (en) 2020-12-30 2020-12-30 Depth map generation method and device for 2D-3D video conversion

Country Status (1)

Country Link
CN (1) CN112822479A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101640809A (en) * 2009-08-17 2010-02-03 浙江大学 Depth extraction method of merging motion information and geometric information
CN103366332A (en) * 2013-06-18 2013-10-23 河海大学 Depth information-based image watermarking method
CN111062263A (en) * 2019-11-27 2020-04-24 杭州易现先进科技有限公司 Method, device, computer device and storage medium for hand pose estimation
CN111476156A (en) * 2020-04-07 2020-07-31 上海龙晶科技有限公司 Real-time intelligent monitoring algorithm for mice and other small animals

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU Yang et al.: "Moving Object Detection Based on Optical Flow and the Mean Shift Algorithm", Information Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113284081A (en) * 2021-07-20 2021-08-20 杭州小影创新科技股份有限公司 Depth map super-resolution optimization method and device, processing equipment and storage medium
CN114581611A (en) * 2022-04-28 2022-06-03 阿里巴巴(中国)有限公司 Virtual scene construction method and device
CN115601233A (en) * 2022-12-14 2023-01-13 南京诺源医疗器械有限公司(Cn) Method for converting 2D (two-dimensional) image into 3D (three-dimensional) image of medical image

Similar Documents

Publication Publication Date Title
CN109003325B (en) Three-dimensional reconstruction method, medium, device and computing equipment
CN112822479A (en) Depth map generation method and device for 2D-3D video conversion
CN101640809B (en) Depth extraction method of merging motion information and geometric information
CN102741879B (en) Method for generating depth maps from monocular images and systems using the same
CN102663766B (en) Non-photorealistic based art illustration effect drawing method
CN106875437B (en) RGBD three-dimensional reconstruction-oriented key frame extraction method
US20040032488A1 (en) Image conversion and encoding techniques
CN101287142A (en) Method for converting flat video to tridimensional video based on bidirectional tracing and characteristic points correction
WO2018053952A1 (en) Video image depth extraction method based on scene sample library
CN103826032B (en) Depth map post-processing method
CN109712247B (en) Live-action training system based on mixed reality technology
WO2014121108A1 (en) Methods for converting two-dimensional images into three-dimensional images
Yan et al. Depth map generation for 2d-to-3d conversion by limited user inputs and depth propagation
CN106447718B (en) A kind of 2D turns 3D depth estimation method
CN107730472A (en) A kind of image defogging optimized algorithm based on dark primary priori
CN111899295A (en) Monocular scene depth prediction method based on deep learning
CN104159098B (en) The translucent edge extracting method of time domain consistence of a kind of video
Zhang et al. Interactive stereoscopic video conversion
KR101125061B1 (en) A Method For Transforming 2D Video To 3D Video By Using LDI Method
Wang et al. Example-based video stereolization with foreground segmentation and depth propagation
Mathai et al. Automatic 2D to 3D video and image conversion based on global depth map
Liu et al. Fog effect for photography using stereo vision
CN110149508A (en) A kind of array of figure generation and complementing method based on one-dimensional integrated imaging system
KR102648882B1 (en) Method for lighting 3D map medeling data
CN112598777B (en) Haze fusion method based on dark channel prior

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210518