CN111447428A - Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment


Info

Publication number: CN111447428A
Authority: CN (China)
Prior art keywords: image, depth map, background, foreground, module
Legal status: Withdrawn
Application number: CN202010172617.7A
Other languages: Chinese (zh)
Inventors: 黄胜海, 钟伦超
Current Assignee: Individual
Original Assignee: Individual
Filing date: 2020-03-12
Publication date: 2020-07-24
Application filed by: Individual
Priority: CN202010172617.7A

Classifications

    • H: ELECTRICITY; H04: ELECTRIC COMMUNICATION TECHNIQUE; H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N 13/00: Stereoscopic video systems; multi-view video systems; details thereof
    • H04N 13/10: Processing, recording or transmission of stereoscopic or multi-view image signals; H04N 13/106: Processing image signals
    • H04N 13/128: Adjusting depth or disparity
    • H04N 13/122: Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H04N 13/167: Synchronising or controlling image signals
    • H04N 13/20: Image signal generators; H04N 13/261: Image signal generators with monoscopic-to-stereoscopic image conversion
    • H04N 2013/0074: Stereoscopic image analysis
    • H04N 2013/0096: Synchronisation or controlling aspects

Abstract

The application discloses an image processing method, an image processing apparatus, a computer-readable storage medium, and an image processing device. The image processing method comprises the following steps: acquiring a 2D image to be processed; obtaining a texture depth map of the 2D image according to the texture features of the 2D image; dividing the 2D image into a foreground region and a background region; acquiring a motion depth map of the foreground region and a linear depth map of the background region; fusing the motion depth map with the texture depth map to obtain a foreground depth map, and fusing the linear depth map with the texture depth map to obtain a background depth map; fusing the foreground depth map with the background depth map to obtain a depth map of the 2D image; and fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect. Compared with traditional methods that extract depth values from a single cue, this composite extraction yields more accurate and more precise depth values, so converting a 2D image into a 3D or VR image produces a better result.

Description

Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment
Technical Field
The present application relates to the field of image processing, and in particular to an image conversion method, an image conversion apparatus, a computer-readable storage medium, and a device.
Background
With the increasing maturity of digital video technology and the rapid development of image and graphics technology, computer technology, electronic information technology, simulation technology, video compression coding, broadcasting technology, and display technology, 3D and VR technologies are advancing very rapidly. Video shot with a single color camera combined with 2D-to-3D conversion technology is a major source of current 3D content.
2D-to-3D conversion requires a depth extraction algorithm. Existing depth extraction algorithms include binocular parallax, defocus, and the like. The binocular parallax method uses stereo matching to find corresponding pixels in two images shot from different angles and computes the pixel disparity of each pair; the larger the disparity, the closer the scene point, and the disparity is then converted into scene depth. The defocus method requires a defocus difference between the foreground and background of the scene and is not suitable for images with blurred texture. These depth extraction algorithms therefore yield image depth values of limited accuracy, so the result of converting a 2D image into 3D or VR is unsatisfactory.
Disclosure of Invention
The invention aims to provide an image conversion method, an image conversion apparatus, a computer-readable storage medium, and a device that improve the precision of the depth values of a 2D image, so that converting the 2D image into a 3D or VR image produces a better result.
In particular, the present invention provides an image conversion method comprising the steps of:
step 100, obtaining a 2D image to be processed, and obtaining a texture depth map of the 2D image according to texture features of the 2D image;
step 200, dividing the 2D image into a foreground region and a background region according to a preset criterion, and respectively obtaining a motion depth map of the foreground region and a linear depth map of the background region;
step 300, fusing the motion depth map and the texture depth map to obtain a foreground depth map, fusing the linear depth map and the texture depth map to obtain a background depth map, and fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image;
and step 400, fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect.
In an embodiment of the present invention, the process of dividing the 2D image into a foreground region and a background region according to the preset criterion is as follows:
step 201, the preset criterion is that the area formed by static objects in the 2D image is taken as the background region, and the area formed by moving objects in the 2D image is taken as the foreground region;
step 202, preprocessing the 2D image, establishing a corresponding background image, and acquiring the pixel values of the background image;
step 203, differencing the 2D image with the background image, and determining the foreground region according to the difference values;
and step 204, determining the part of the 2D image outside the foreground region as the background region.
In an embodiment of the present invention, a step of smoothing and filtering the depth map of the 2D image is further included before step 400 is performed.
In one embodiment of the present invention, the stereoscopic image in step 400 includes: a left or right view for 3D video; or a disparity map suitable for VR video.
In one embodiment of the present invention, there is provided an image conversion apparatus including:
the image acquisition module is used for acquiring a 2D image to be processed;
the image processing module is used for dividing the 2D image into a foreground region and a background region according to a preset criterion;
the depth map obtaining module is used for obtaining a texture depth map of the 2D image according to the texture features of the 2D image; acquiring a motion depth map of the foreground area and a linear depth map of the background area; fusing the motion depth map and the texture depth map to obtain a foreground depth map, and fusing the linear depth map and the texture depth map to obtain a background depth map; fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image;
and the image conversion module is used for fusing the 2D image and the depth map of the 2D image to obtain an image with stereoscopic impression.
In one embodiment of the present invention, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the aforementioned image conversion method.
In one embodiment of the invention, an image conversion device is provided, which comprises a base into which the 2D image device to be converted is plugged, an HDMI interface, and a USB interface, wherein:
the USB interface is used for transmitting an external 2D image to be processed to the base;
the base comprises a core component for reading 2D images, and the core component comprises a depth map generation module: the depth map generation module obtains a texture depth map of the 2D image according to the texture features of the 2D image, then divides the 2D image into a foreground region and a background region according to the preset criterion, obtains a motion depth map of the foreground region and a linear depth map of the background region, fuses the motion depth map with the texture depth map to obtain a foreground depth map, fuses the linear depth map with the texture depth map to obtain a background depth map, and fuses the foreground depth map with the background depth map to obtain the depth map of the 2D image; and
the stereoscopic image generation module is used for fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect;
the HDMI interface is used for transmitting the stereoscopic image to an external device.
In one embodiment of the present invention, the core component further comprises:
the signal conversion module is used for converting the 2D image from an RGB mode to a YUV mode;
an image adjustment module for downscaling the 2D image;
the motion estimation module is used for carrying out motion estimation on the 2D image to obtain a corresponding motion vector; the motion vector is used for generating a depth map of the 2D image;
the depth map adjustment module is used for smoothing and upscaling the depth map of the 2D image;
and the filtering module is used for carrying out filtering processing on the depth map of the 2D image.
In one embodiment of the invention, the base further comprises a configuration component comprising a transmission module and a synchronization module;
the transmission module is used for transmitting the received configuration information for processing the 2D image to the synchronization module;
and the synchronization module is used for converting the configuration information from the APB clock domain to the pixel clock domain and transmitting the converted pixel-clock-domain configuration information to the core component.
In one embodiment of the invention, the image conversion device is operable in a configuration mode and a running mode:
in the configuration mode, an external CPU can read and write the registers of the depth map generation module through the APB bus;
in the running mode, the external CPU cannot read or write the registers of the depth map generation module; after one frame of image has been processed, the running mode can be switched back to the configuration mode.
According to the image processing method, apparatus, computer-readable storage medium, and device of the present application, the image is divided into a foreground region and a background region whose depth values are extracted separately, the texture depth values obtained from the texture features of the image are then incorporated, and the depth values of the whole image are obtained by fusion, so that the resulting depth values are more accurate and more precise than those produced by a single extraction method.
Drawings
FIG. 1 is a flow diagram of a method for image conversion in one embodiment;
FIG. 2 is an exploded flow diagram of a method for image conversion in one embodiment;
FIG. 3 is a flow diagram of partitioning a 2D image into foreground and background regions in one embodiment;
FIG. 4 is a block diagram of an image conversion apparatus in one embodiment;
FIG. 5 is a schematic diagram showing the configuration of an image conversion apparatus according to an embodiment;
FIG. 6 is a block diagram of a portion of the hardware of the base in one embodiment.
Detailed Description
As shown in fig. 1, one embodiment of the present invention discloses an image conversion method, including the steps of:
step 100, obtaining a 2D image to be processed, and obtaining a texture depth map of the 2D image according to texture features of the 2D image;
step 200, dividing the 2D image into a foreground region and a background region according to a preset criterion, and respectively obtaining a motion depth map of the foreground region and a linear depth map of the background region;
step 300, fusing the motion depth map and the texture depth map to obtain a foreground depth map, fusing the linear depth map and the texture depth map to obtain a background depth map, and fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image;
and step 400, fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect.
Example 1:
A 2D image to be processed is acquired; this is a 2D image that needs to be converted into a stereoscopic image. Specifically, the 2D image to be processed may be an image in a game or a video, captured by a stationary camera.
After the image to be processed is acquired, the texture features of the 2D image can be extracted. A texture feature is a visual feature that reflects homogeneity in an image; it embodies the arrangement of surface structures that change slowly or periodically across the surface of an object. Texture features typically exhibit a locally repeated order, a non-random arrangement, and approximate uniformity within a textured region. Common methods for extracting image texture features include: geometric methods based on basic texture elements (texels); model-based methods, which assume the texture is generated by a distribution model controlled by certain parameters; signal-processing methods, which apply a transform to a region of the texture image, extract feature values that remain relatively stable, and use those values to characterize consistency within a region and dissimilarity between regions; and structural methods based on the regularity of the texture. After the texture features of the 2D image are obtained, the corresponding texture depth map can be derived from them.
The 2D image can further be divided into a foreground region and a background region. The background region is the area occupied by static objects in the picture, and the foreground region is the area occupied by moving objects.
As shown in fig. 2, the working flow of converting a 2D image into a stereoscopic image is as follows: step 101, acquire the 2D image to be processed; step 102, divide the 2D image into a foreground region and a background region; step 103, extract a texture depth map based on the texture features of the 2D image; step 104, extract a motion depth map from the foreground region; step 105, extract a linear depth map from the background region; step 106, obtain the foreground depth map; step 107, obtain the background depth map; step 108, fuse the foreground depth map and the background depth map to obtain the depth map of the 2D image; step 109, obtain the 2D image corresponding to the depth map; step 110, perform virtual viewpoint synthesis on the depth map of the 2D image and the corresponding 2D image to obtain a stereoscopic image.
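For illustration only, a minimal, self-contained Python/OpenCV sketch of steps 101 to 108 is given below. The per-step operators (Sobel gradient magnitude as the texture cue, frame differencing for segmentation, Farneback optical flow for motion depth, a vertical ramp for linear depth) and the fusion weights are assumptions chosen for brevity, not details fixed by this application:

```python
import cv2
import numpy as np

def depth_from_frame_pair(frame_bgr, prev_bgr, w_fg=0.6, w_bg=0.6):
    """Illustrative sketch of the FIG. 2 flow (steps 101-108)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    prev = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)

    # step 103: texture depth cue - local gradient magnitude
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    texture = cv2.normalize(cv2.magnitude(gx, gy), None, 0, 255,
                            cv2.NORM_MINMAX)

    # step 102: foreground mask by differencing against the previous frame
    _, fg_mask = cv2.threshold(cv2.absdiff(gray, prev), 25, 255,
                               cv2.THRESH_BINARY)

    # step 104: motion depth - dense optical-flow magnitude
    flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    motion = cv2.normalize(np.hypot(flow[..., 0], flow[..., 1]),
                           None, 0, 255, cv2.NORM_MINMAX)

    # step 105: linear depth - far (0) at the top row, near (255) at the bottom
    h, w = gray.shape
    linear = np.tile(np.linspace(0.0, 255.0, h,
                                 dtype=np.float32)[:, None], (1, w))

    # steps 106-108: weighted fusion inside/outside the foreground mask
    foreground_depth = w_fg * motion + (1.0 - w_fg) * texture
    background_depth = w_bg * linear + (1.0 - w_bg) * texture
    return np.where(fg_mask == 255,
                    foreground_depth, background_depth).astype(np.uint8)
```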
FIG. 3 is a flow diagram of dividing a 2D image into foreground and background regions in one embodiment. As shown in fig. 3: step 201, the 2D image to be processed is acquired. Step 202, the 2D image is preprocessed and a corresponding background image is established. Step 203, the 2D image is differenced with the background image to obtain the foreground region of the 2D image. Step 204, the foreground region is post-processed, including filtering, smoothing, and noise handling. Step 205, the foreground region is determined, and the part of the 2D image outside the foreground region is determined as the background region.
Specifically, the 2D image may be divided into a foreground region and a background region by background subtraction, which proceeds as follows: preprocess the 2D image, establish a corresponding background image, and acquire the pixel values of the background image; difference the 2D image with the background image and determine the foreground region from the difference values; determine the part of the 2D image outside the foreground region as the background region.
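A minimal sketch of this split, assuming the reference background image has already been established (the threshold value is an illustrative choice, not one specified in the application):

```python
import cv2

def split_foreground_background(image_bgr, background_bgr, thresh=30):
    """Difference the 2D image against the background image and return a
    binary mask: 255 marks the foreground region, 0 the background region."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    bg = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, bg)  # per-pixel absolute difference
    _, fg_mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return fg_mask
```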
Background subtraction first requires establishing a reference background image; the 2D image to be processed is then differenced with this background image to determine the foreground and background regions. Establishing the background image can be done in two ways. The first updates the background manually: background information of the original scene, free of moving objects, is supplied when the model is created, which limits the conditions under which this approach can be applied. The second is an adaptive background image that is updated automatically as the actual video scene changes; a common method is Gaussian mixture modeling, for example a K-means-based Gaussian mixture model that is robust to background variation. The Gaussian mixture model builds a multi-component Gaussian model for each pixel to represent the variation of that pixel's gray value, judges the background image according to the weights and variances, and matches each frame of the 2D image to be processed against these background models to determine which pixels belong to the foreground region. Because the scene changes dynamically, the weight, mean, and variance of each component must also be updated in real time to adapt to the environment. To address the problems of the traditional Gaussian mixture model, namely inaccurate initialization from the first frame, wasted resources from a fixed number of mixture components, and low computational efficiency, a crystal tensioning method can be adopted to establish the background image.
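The adaptive mixture-of-Gaussians model described above is available off the shelf in OpenCV; the sketch below uses it with illustrative parameters (the application does not name a specific implementation, so this is one plausible realization):

```python
import cv2

# One Gaussian mixture per pixel; weights, means and variances are
# updated online as frames arrive, as described above.
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500,         # number of frames used to estimate the background
    varThreshold=16,     # squared Mahalanobis distance marking "foreground"
    detectShadows=True,  # shadow pixels are labeled separately (value 127)
)

def adaptive_foreground_mask(frame_bgr):
    mask = subtractor.apply(frame_bgr)  # 0 = background, 127 = shadow, 255 = fg
    # keep only confident foreground pixels, dropping the shadow label
    return cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]
```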
After the foreground region of the 2D image to be processed is determined, it must be post-processed; the post-processing may specifically include shadow detection, noise detection, smoothing, filtering, and the like.
Shadows are created when objects in the scene block the light source. When an object faces away from the light source, its surface carries a shadow, and when the object is opaque, a shadow is cast on the occluded area. A typical shadow segmentation algorithm segments the cast shadow. The present application adopts a shadow detection method in HSV (Hue, Saturation, Value) space. The HSV color model describes a color by three attributes matched to human visual characteristics: H (hue) is the perceptual attribute by which a surface appears similar to red, yellow, green, blue, and so on; S (saturation) is the degree to which the color is diluted by white light; V (value) is the brightness of the surface. The three channels are relatively independent in the HSV color space and can be manipulated separately in the computation. The HSV color space is closer to human color vision, represents both the gray-scale and color information of the image, and can accurately locate shadows even in particularly bright or particularly dark parts of the picture.
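Reduced to a sketch, the HSV shadow test follows the common rule that a cast shadow is a darkened copy of the background whose hue and saturation stay nearly unchanged; all threshold values below are illustrative assumptions:

```python
import cv2
import numpy as np

def shadow_mask(frame_bgr, background_bgr,
                v_lo=0.4, v_hi=0.9, s_tol=60, h_tol=30):
    """Mark a pixel as cast shadow when its brightness V drops by a bounded
    factor relative to the background while H and S remain close to it."""
    f = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    b = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)

    v_ratio = f[..., 2] / np.maximum(b[..., 2], 1.0)
    s_diff = np.abs(f[..., 1] - b[..., 1])
    h_diff = np.abs(f[..., 0] - b[..., 0])
    h_diff = np.minimum(h_diff, 180.0 - h_diff)  # OpenCV hue wraps at 180

    shadow = ((v_lo <= v_ratio) & (v_ratio <= v_hi)
              & (s_diff <= s_tol) & (h_diff <= h_tol))
    return shadow.astype(np.uint8) * 255
```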
In practice, when a target is segmented from an image, reflections of light or a small gray-level difference between the object and the background inevitably produce horizontal or vertical discontinuities in the extracted foreground region: the edges of the segmented region are rough, the target region contains holes, and some noise is segmented as if it were a moving target, becoming isolated noise points. Noise and isolated noise points in the foreground region strongly affect the subsequent processing of the image and must be removed.
Noise is usually handled with a filtering algorithm. Commonly used filters include mean, Gaussian, morphological, and median filtering, and different filters suit different kinds of noise. Median filtering mainly removes impulse noise and protects image edges from blurring; Gaussian filtering is typically used for Gaussian noise; band-pass filtering can smooth or sharpen the image; morphological filtering removes random noise well with little effect on the image, and can separate or connect adjacent regions in the image, which facilitates post-processing. Since this application obtains the difference image by background subtraction, which can produce random noise and holes in the foreground region, mathematical morphological filtering is selected to process the image.
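The morphological clean-up can be expressed in a few lines; the elliptical kernel and its size are illustrative choices:

```python
import cv2

def clean_foreground(fg_mask, ksize=5):
    """Morphologically post-process the binary foreground mask:
    opening removes isolated noise points, closing fills small holes."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (ksize, ksize))
    opened = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
```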
A motion depth map of the foreground region and a linear depth map of the background region are acquired.
The motion depth map is fused with the texture depth map to obtain the foreground depth map, and the linear depth map is fused with the texture depth map to obtain the background depth map.
The foreground depth map and the background depth map are fused to obtain the depth map of the 2D image.
Obtaining the motion depth map of the foreground region and the linear depth map of the background region may include: for the foreground region, obtaining the motion depth map with a high-precision optical flow method; for the background region, assigning values according to the principle of geometric perspective to obtain a linear depth map layered from near to far.
After the motion depth map of the foreground region and the linear depth map of the background region are obtained, each is fused with the texture depth map; specifically, the motion depth map and the linear depth map are each combined with the texture depth map by weighted fusion. For example, when the value of a point in the binary mask is 255, the point is judged to belong to the foreground region, and its depth value is the weighted combination of the corresponding depth values in the motion depth map and the texture depth map; when the value of a point is 0, its depth value is the weighted combination of the corresponding depth values in the linear depth map and the texture depth map.
After the foreground depth map and the background depth map of the 2D image to be processed are obtained, they are fused to obtain the depth map of the 2D image.
The 2D image and its depth map are then fused to obtain an image with a stereoscopic effect.
Specifically, the position coordinates of each pixel in the original three-dimensional space are computed from the depth map of the 2D image; that is, the two-dimensional scene image is mapped into three-dimensional space. The three-dimensional coordinates are then mapped to the two-dimensional coordinates of the target viewpoint through translation, rotation, and similar operations according to the image parameters, yielding the image at the target viewpoint.
According to the configured parameters, the 2D image can be converted into an image with a stereoscopic effect. Such stereoscopic images include a left-view or right-view image for 3D video, and an image with parallax for VR (Virtual Reality) video. Specifically, virtual viewpoint synthesis based on DIBR (Depth-Image-Based Rendering) may be adopted: a corresponding depth map sequence is attached to the original 2D video, so that an image of any viewpoint can be produced at the receiving terminal as required; this is faster to process than two separate 2D video streams and requires less transmission bandwidth.
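For illustration, a much-simplified DIBR-style warp is sketched below: each pixel of an 8-bit image is shifted horizontally by a disparity proportional to its depth value, and disoccluded holes are filled by inpainting. A production renderer would additionally order the warp by depth to resolve occlusions; max_disp and the inpainting radius are illustrative parameters:

```python
import cv2
import numpy as np

def render_virtual_view(image_bgr, depth_u8, max_disp=16, left=True):
    """Warp an 8-bit 2D image to a virtual viewpoint using its depth map.

    Near pixels (large depth value) get a large horizontal shift, far
    pixels a small one; occlusion ordering is ignored for brevity."""
    h, w = depth_u8.shape
    disparity = depth_u8.astype(np.float32) / 255.0 * max_disp
    if not left:
        disparity = -disparity

    view = np.zeros_like(image_bgr)
    filled = np.zeros((h, w), np.uint8)
    xs = np.arange(w)
    for y in range(h):
        tx = (xs + disparity[y]).astype(np.int32)
        ok = (tx >= 0) & (tx < w)
        view[y, tx[ok]] = image_bgr[y, ok]
        filled[y, tx[ok]] = 255

    holes = cv2.bitwise_not(filled)  # disoccluded pixels
    return cv2.inpaint(view, holes, 3, cv2.INPAINT_TELEA)
```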
Further, by performing the above processing frame by frame, the 2D images in a video can be converted into stereoscopic images; that is, a 2D video can be converted into a 3D video or a VR video.
According to the image processing method in this implementation of the application, the image is divided into a foreground region and a background region whose depth values are extracted separately; the texture depth values obtained from the texture features of the image are then incorporated, and the depth values of the whole image are obtained by fusion, which yields more accurate depth values than single-cue extraction.
Furthermore, with this method the 2D image can be converted either into a left or right view for 3D video or into a disparity map for VR video, making image conversion convenient.
It should be understood that, although the steps in the flow charts described above are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not bound to a strict order and may be performed in other orders. Moreover, at least some of the steps may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times; their order of execution is not necessarily sequential, and they may be performed in turn or in alternation with other steps, or with at least some of the sub-steps or stages of other steps.
Example 2:
as shown in fig. 4, there is disclosed an image conversion apparatus including:
an image obtaining module 301, configured to obtain a 2D image to be processed.
An image processing module 302 for dividing the 2D image into a foreground region and a background region.
The depth map obtaining module 303 is configured to obtain a texture depth map of the 2D image according to the texture features of the 2D image; acquiring a motion depth map of a foreground area and a linear depth map of a background area; fusing the motion depth map and the texture depth map to obtain a foreground depth map, and fusing the linear depth map and the texture depth map to obtain a background depth map; and fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image.
And the image conversion module 304 is configured to fuse the 2D image and the depth map of the 2D image to obtain an image with stereoscopic effect.
In one embodiment, the image processing module 302 divides the 2D image into a foreground region and a background region by: preprocessing the 2D image, establishing a corresponding background image, and acquiring the pixel values of the background image; differencing the 2D image with the background image and determining the foreground region according to the difference values; and determining the part of the 2D image outside the foreground region as the background region.
In one embodiment, the image processing module 302 is further configured to perform smoothing and filtering on the depth map of the 2D image before fusing the 2D image and the depth map of the 2D image to obtain an image with stereoscopic effect.
In one embodiment, the stereoscopic image may include: a left or right view for 3D video; or a disparity map suitable for VR video.
The processing flow of each module in the image conversion device is the same as the image conversion method flow in the above embodiment, and is not described herein again.
Example 3:
in one embodiment, the image conversion method provided by the present application may be implemented in the form of a computer program that is executable on a computer device, and a non-volatile storage medium of the computer device may store various program modules, such as an image acquisition module, an image processing module, an image conversion module, and the like. The computer program composed of the respective program modules causes the processor to execute the steps in the image conversion method of the respective embodiments of the present application described in the present specification.
Example 4:
Fig. 5 is a schematic diagram of an image conversion device, which includes a base 10, an HDMI (High-Definition Multimedia Interface) interface 20, and a USB (Universal Serial Bus) interface 30. The base 10 lets mobile phones of different brands be inserted directly into the hardware module, and a display terminal, such as a display screen or VR glasses, is connected through the HDMI interface 20 of the hardware module.
Digital devices such as mobile phones and game consoles are placed in the base 10; 2D videos, games, and the like on the phone or console are then converted inside the base 10 and output with a 3D effect to a television, VR glasses, or another large display screen. The HDMI interface 20 is responsible for transmitting 3D images outward to the television, VR glasses, or other large display screen; the USB interface 30 transmits 2D images inward from the interfaces of the mobile phone or game console to the base 10. The image conversion device may further include a charging interface 40 for charging the whole system.
Fig. 6 illustrates the hardware portion of the base 10 in one embodiment. As shown in fig. 6, the hardware portion of the base 10 includes a core component 120 and a configuration component 100. The configuration component 100 is connected to a microcontroller in the base 10 through an APB (Advanced Peripheral Bus) bus, receives the configuration information sent by the microcontroller, completes the configuration of the depth map generation module 124, and controls the opening and closing of the whole module. The configuration component 100 includes a transmission module 111 and a synchronization module 112. The core component 120 mainly acquires the depth map of the 2D image and post-processes that depth map. The core component 120 includes a signal conversion module 121, an image adjustment module 122, a motion estimation module 123, a depth map generation module 124, a depth map adjustment module 125, and a filter processing module 126.
In the configuration mode, the depth map generation module 124 does not operate, and an external CPU (central processing unit) can read and write the registers of the depth map generation module 124 through the APB bus. At this time, the internal registers of the depth map generation module 124 can be configured, including the edge value threshold, the image sharpening threshold, the operating mode of the module, and so on, and the CPU can read the version number of the module, the configuration state of the registers, and the like.
In the running mode, the APB interface cannot write to the registers inside the depth map generation module 124. In this mode, the depth map generation module 124 extracts the depth map from the input image, applies post-processing such as image enhancement to the extracted depth map, and outputs the processed depth map to the subsequent module. Switching from the configuration mode to the running mode, or from the running mode back to the configuration mode, can be completed only after the processing and transmission of one frame of image has finished and before the processing and transmission of the next frame begins, to guarantee the integrity of each frame.
The transmission module 111 receives the configuration information sent over the APB bus and passes it to the synchronization module 112. The synchronization module 112 mainly handles synchronization between the two clock domains: it converts the incoming configuration information from the APB clock domain to the pixel clock domain and outputs the pixel-clock-domain configuration information to the core component 120. After the registers of the core component 120 have been configured, a start signal is issued to the core component 120.
After receiving the start signal, the core component 120 begins to receive the 2D image delivered by the external USB interface. The signal conversion module 121 first converts the 2D image from RGB (Red, Green, Blue) mode to YUV mode; the YUV signal is then sent to the image adjustment module 122 for downscaling, and the image adjustment module 122 writes the intermediate data to DDR3 (Double Data Rate 3) memory for temporary storage through the Avalon host interface. The downscaled data enters the motion estimation module 123 for motion estimation to obtain motion vectors. The depth map generation module 124 computes the depth map of the 2D image from the motion vectors and sends it to the depth map adjustment module 125 for smoothing and upscaling. The processed data is written to DDR3 through the Avalon host interface. The filtering module 126 reads the stored data back from DDR3, filters it, and writes it back to DDR3 through the Avalon host interface.
Under ideal conditions, the camera imaging model can be approximated by the pinhole model, so the correspondence between the depth of a scene point in the real world and the image coordinates of its projection in the camera can be derived from the pinhole model. For most video images, the depth (i.e., the object distance) of the real scene is much greater than the image distance, so by the convex-lens imaging principle it can be assumed that the images of the scene all lie on the camera's focal plane. When estimating the depth of a region, the position of the line bundle (the perspective line structure) in the video image is estimated by computing vanishing points, image edge features, and the like; the coordinates corresponding to the line bundle are obtained, and the depth of the region's pixels is computed in combination with the center coordinates of the image plane. The position of the line bundle is estimated from features such as the vanishing point, the image is rotated so that the bundle is horizontal, and the depth of the region is computed from the rotated pixel coordinates, completing the depth estimation of the video region.
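Reduced to a sketch, and assuming the horizon row (the image row of the rotated line bundle) has already been estimated, the perspective-based linear depth assignment might look as follows; the linear ramp and the example horizon position are illustrative:

```python
import numpy as np

def linear_depth_map(height, width, horizon_row):
    """Assign a perspective-based linear depth to every pixel row:
    rows at or above the horizon get depth 0 (farthest), and depth grows
    linearly toward the bottom edge (nearest), matching the far-to-near
    layering used for the background region."""
    rows = np.arange(height, dtype=np.float32)
    below = np.clip(rows - horizon_row, 0.0, None)
    span = max(height - 1 - horizon_row, 1)
    ramp = below / span * 255.0
    return np.tile(ramp[:, None], (1, width)).astype(np.uint8)

# e.g. a 1080p frame whose estimated horizon sits 40% down the image
depth = linear_depth_map(1080, 1920, horizon_row=int(0.4 * 1080))
```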
The stereoscopic image generation module 127 fuses the depth map of the processed 2D image with the 2D image to obtain a left or right view for 3D video, or a disparity map suitable for VR video.
The specific processing procedure of each component or module in the image processing apparatus on the image is described in the above embodiments, and is not described herein again.
The image processing device in this implementation of the application divides the image into a foreground region and a background region whose depth values are extracted separately, incorporates the texture depth values obtained from the texture features of the image, and fuses them to obtain the depth values of the whole image, which yields more accurate depth values than single-cue extraction.
Furthermore, the device can convert the 2D image either into a left or right view for 3D video or into a disparity map for VR video, making image conversion convenient.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium; when the program is executed, it can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or the like.
Suitable non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Thus, it should be appreciated by those skilled in the art that while a number of exemplary embodiments of the invention have been illustrated and described in detail herein, many other variations or modifications consistent with the principles of the invention may be directly determined or derived from the disclosure of the present invention without departing from the spirit and scope of the invention. Accordingly, the scope of the invention should be understood and interpreted to cover all such other variations or modifications.

Claims (10)

1. An image conversion method, characterized by comprising the steps of:
step 100, obtaining a 2D image to be processed, and obtaining a texture depth map of the 2D image according to texture features of the 2D image;
step 200, dividing the 2D image into a foreground region and a background region according to a preset criterion, and respectively obtaining a motion depth map of the foreground region and a linear depth map of the background region;
step 300, fusing the motion depth map and the texture depth map to obtain a foreground depth map, fusing the linear depth map and the texture depth map to obtain a background depth map, and fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image;
and step 400, fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect.
2. The image conversion method according to claim 1,
the process of dividing the 2D image into a foreground region and a background region according to the preset criterion is as follows:
step 201, the preset criterion is that the area formed by static objects in the 2D image is taken as the background region, and the area formed by moving objects in the 2D image is taken as the foreground region;
step 202, preprocessing the 2D image, establishing a corresponding background image, and acquiring the pixel values of the background image;
step 203, differencing the 2D image with the background image, and determining the foreground region according to the difference values;
and step 204, determining the part of the 2D image outside the foreground region as the background region.
3. The image conversion method according to claim 1,
a step of smoothing and filtering the depth map of the 2D image is further included before step 400 is performed.
4. The image conversion method according to claim 1,
the stereoscopic image in step 400 includes: a left or right view for 3D video; or a disparity map suitable for VR video.
5. An image conversion apparatus characterized by comprising:
the image acquisition module is used for acquiring a 2D image to be processed;
the image processing module is used for dividing the 2D image into a foreground region and a background region according to a preset criterion;
the depth map obtaining module is used for obtaining a texture depth map of the 2D image according to the texture features of the 2D image; acquiring a motion depth map of the foreground area and a linear depth map of the background area; fusing the motion depth map and the texture depth map to obtain a foreground depth map, and fusing the linear depth map and the texture depth map to obtain a background depth map; fusing the foreground depth map and the background depth map to obtain a depth map of the 2D image;
and the image conversion module is used for fusing the 2D image and the depth map of the 2D image to obtain an image with stereoscopic impression.
6. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 4.
7. An image conversion device, comprising a base for plugging in the 2D image device to be converted, an HDMI interface, and a USB interface, characterized in that:
the USB interface is used for transmitting an external 2D image to be processed to the base;
the base comprises a core component for reading 2D images, and the core component comprises a depth map generation module: the depth map generation module obtains a texture depth map of the 2D image according to the texture features of the 2D image, then divides the 2D image into a foreground region and a background region according to the preset criterion, obtains a motion depth map of the foreground region and a linear depth map of the background region, fuses the motion depth map with the texture depth map to obtain a foreground depth map, fuses the linear depth map with the texture depth map to obtain a background depth map, and fuses the foreground depth map with the background depth map to obtain the depth map of the 2D image; and
the stereoscopic image generation module is used for fusing the 2D image with the depth map of the 2D image to obtain an image with a stereoscopic effect;
the HDMI interface is used for transmitting the stereoscopic image to an external device.
8. The image conversion apparatus according to claim 7, wherein the core component further includes:
the signal conversion module is used for converting the 2D image from an RGB mode to a YUV mode;
an image adjustment module for downscaling the 2D image;
the motion estimation module is used for carrying out motion estimation on the 2D image to obtain a corresponding motion vector; the motion vector is used for generating a depth map of the 2D image;
the depth map adjustment module is used for smoothing and upscaling the depth map of the 2D image;
and the filtering module is used for carrying out filtering processing on the depth map of the 2D image.
9. The image conversion apparatus according to claim 7, characterized in that:
the base further comprises a configuration component comprising a transmission module and a synchronization module;
the transmission module is used for transmitting the received configuration information for processing the 2D image to the synchronization module;
and the synchronization module is used for converting the configuration information from the APB clock domain to the pixel clock domain and transmitting the converted pixel-clock-domain configuration information to the core component.
10. The image conversion device of claim 7, wherein the image conversion device is operable in a configuration mode and a running mode:
in the configuration mode, an external CPU can read and write the registers of the depth map generation module through the APB bus;
in the running mode, the external CPU cannot read or write the registers of the depth map generation module; after one frame of image has been processed, the running mode can be switched back to the configuration mode.
CN202010172617.7A 2020-03-12 2020-03-12 Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment Withdrawn CN111447428A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010172617.7A CN111447428A (en) 2020-03-12 2020-03-12 Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010172617.7A CN111447428A (en) 2020-03-12 2020-03-12 Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment

Publications (1)

Publication Number Publication Date
CN111447428A (zh) 2020-07-24

Family

ID=71650557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010172617.7A Withdrawn CN111447428A (en) 2020-03-12 2020-03-12 Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment

Country Status (1)

Country Link
CN (1) CN111447428A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815666A (en) * 2020-08-10 2020-10-23 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic device
WO2022071875A1 (en) * 2020-09-30 2022-04-07 脸萌有限公司 Method and apparatus for converting picture into video, and device and storage medium
CN115713465A (en) * 2022-10-28 2023-02-24 北京阅友科技有限公司 Three-dimensional display method and device of plane image, storage medium and terminal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872546A (en) * 2010-05-06 2010-10-27 复旦大学 Video-based method for rapidly detecting transit vehicles
CN103413347A (en) * 2013-07-05 2013-11-27 南京邮电大学 Extraction method of monocular image depth map based on foreground and background fusion
CN103686139A (en) * 2013-12-20 2014-03-26 华为技术有限公司 Frame image conversion method, frame video conversion method and frame video conversion device
US20160301936A1 (en) * 2011-07-22 2016-10-13 Qualcomm Incorporated Mvc based 3dvc codec supporting inside view motion prediction (ivmp) mode
CN106327500A (en) * 2016-08-31 2017-01-11 重庆大学 Depth information obtaining method and apparatus
CN110660131A (en) * 2019-09-24 2020-01-07 宁波大学 Virtual viewpoint hole filling method based on depth background modeling

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872546A (en) * 2010-05-06 2010-10-27 复旦大学 Video-based method for rapidly detecting transit vehicles
US20160301936A1 (en) * 2011-07-22 2016-10-13 Qualcomm Incorporated Mvc based 3dvc codec supporting inside view motion prediction (ivmp) mode
CN103413347A (en) * 2013-07-05 2013-11-27 南京邮电大学 Extraction method of monocular image depth map based on foreground and background fusion
CN103686139A (en) * 2013-12-20 2014-03-26 华为技术有限公司 Frame image conversion method, frame video conversion method and frame video conversion device
CN106327500A (en) * 2016-08-31 2017-01-11 重庆大学 Depth information obtaining method and apparatus
CN110660131A (en) * 2019-09-24 2020-01-07 宁波大学 Virtual viewpoint hole filling method based on depth background modeling

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815666A (en) * 2020-08-10 2020-10-23 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic device
CN111815666B (en) * 2020-08-10 2024-04-02 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic equipment
WO2022071875A1 (en) * 2020-09-30 2022-04-07 脸萌有限公司 Method and apparatus for converting picture into video, and device and storage medium
US11871137B2 (en) 2020-09-30 2024-01-09 Lemon Inc. Method and apparatus for converting picture into video, and device and storage medium
CN115713465A (en) * 2022-10-28 2023-02-24 北京阅友科技有限公司 Three-dimensional display method and device of plane image, storage medium and terminal
CN115713465B (en) * 2022-10-28 2023-11-14 北京阅友科技有限公司 Stereoscopic display method and device for plane image, storage medium and terminal

Similar Documents

Publication Publication Date Title
US11877086B2 (en) Method and system for generating at least one image of a real environment
US9020241B2 (en) Image providing device, image providing method, and image providing program for providing past-experience images
TWI748949B (en) Methods for full parallax compressed light field synthesis utilizing depth information
KR102146398B1 (en) Three dimensional content producing apparatus and three dimensional content producing method thereof
CN109360235B (en) Hybrid depth estimation method based on light field data
US9014462B2 (en) Depth information generating device, depth information generating method, and stereo image converter
Schmeing et al. Faithful disocclusion filling in depth image based rendering using superpixel-based inpainting
CN109118581B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN103426163A (en) System and method for rendering affected pixels
WO2017078847A1 (en) Fusion of panoramic background images using color and depth data
CN111447428A (en) Method and device for converting plane image into three-dimensional image, computer readable storage medium and equipment
KR20110093829A (en) Method and device for generating a depth map
US20140340486A1 (en) Image processing system, image processing method, and image processing program
Bleyer et al. A stereo approach that handles the matting problem via image warping
CN111047709A (en) Binocular vision naked eye 3D image generation method
CN110276831B (en) Method and device for constructing three-dimensional model, equipment and computer-readable storage medium
CN109190533B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN109064533B (en) 3D roaming method and system
Jung A modified model of the just noticeable depth difference and its application to depth sensation enhancement
Liu et al. Bokeh effects based on stereo vision
Sun et al. Seamless view synthesis through texture optimization
Liu et al. Fog effect for photography using stereo vision
CN115063303A (en) Image 3D method based on image restoration
CN112053434B (en) Disparity map generation method, three-dimensional reconstruction method and related device
Criminisi et al. The SPS algorithm: Patching figural continuity and transparency by split-patch search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200724