WO2014180255A1 - Data processing method, apparatus, computer storage medium and user terminal - Google Patents


Info

Publication number
WO2014180255A1
Authority
WO
WIPO (PCT)
Prior art keywords
mapping model
pixel coordinate
coordinate mapping
image data
right views
Application number
PCT/CN2014/075991
Other languages
French (fr)
Chinese (zh)
Inventor
李飞 (Li Fei)
王云飞 (Wang Yunfei)
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2014180255A1 publication Critical patent/WO2014180255A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; image sequence
    • G06T 2207/10021 Stereoscopic video; stereoscopic image sequence
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20164 Salient point detection; corner detection

Definitions

  • The present invention belongs to the field of multimedia application technologies, and in particular relates to a data processing method, apparatus, computer storage medium, and user terminal.
  • SIFT is the local feature descriptor used in the SIFT local feature description algorithm. SIFT features are highly distinctive, information-rich, and strongly invariant to most image transformations. In Mikolajczyk's invariance comparison experiments on ten local descriptors, including the SIFT operator, SIFT and its extensions proved to have the strongest robustness among comparable descriptors; robustness here denotes a measure of stability.
  • The SIFT algorithm consists of two parts: a scale-invariant region-of-interest detector and a feature descriptor based on the gray-level gradient distribution of the region of interest. Its main characteristics are as follows:
  • The SIFT feature is a local image feature that remains invariant to rotation, scale change, and brightness variation, and retains a degree of stability under viewpoint change, affine transformation, and noise.
  • The SIFT feature matching algorithm comprises two stages.
  • The first stage is the generation of SIFT features, i.e., extracting from multiple images feature vectors that are invariant to scale, rotation, and brightness changes; the second stage is the matching of those SIFT feature vectors.
  • SIFT algorithms are now widely used in object recognition, image restoration, image stitching, and other areas.
  • Random sample consensus (RANSAC) is a robust estimation method proposed by Fischler and Bolles; it has become an important technique for linear and nonlinear model estimation.
  • In the multimedia application field of user terminals such as mobile terminals, a capture device performs stereoscopic imaging with the captured image data. Stereoscopic image synthesis comprises several parts: preparation of the stereo data, sub-pixel decision criteria, sub-pixel sampling for each viewpoint, arrangement and synthesis of the sub-pixels of each viewpoint, and compression, transmission, and display of the stereoscopic image.
  • The embodiments of the present invention are intended to provide a data processing method, apparatus, and user terminal that at least solve the problem that stable stereoscopic image data cannot be obtained, and can optimize and reconstruct the obtained stereoscopic image data.
  • An embodiment of the invention provides a data processing method, including:
  • when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left view and the right view of the frame and matching them, and obtaining a pixel coordinate mapping model between the left and right views from the matching result;
  • each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between its left and right views, and obtaining an average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data;
  • performing image processing based on the average pixel coordinate mapping model.
  • In the above scheme, extracting and matching feature values from the left and right views of each frame of stereoscopic image data specifically includes:
  • extracting SIFT features from the left view and the right view respectively, to obtain the SIFT feature descriptors of each view;
  • performing SIFT feature point matching on the SIFT feature descriptors of the two views, to obtain the matching point pairs of the SIFT feature points of the left and right views.
  • In the above scheme, obtaining the pixel coordinate mapping model between the left and right views from the matching result specifically includes:
  • obtaining the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.
  • In the above scheme, obtaining the pixel coordinate mapping model with the matching point pairs as input parameters specifically includes: randomly selecting a data point set from the set S formed by the matching point pairs, for initialization; filtering from the data point set, according to a preset threshold, a qualifying support point set Si as the consensus set; repeatedly selecting new data samples and estimating the pixel coordinate mapping model between the left and right views according to the comparison of the size of Si with the preset threshold, until the largest consensus set is obtained; and obtaining the desired pixel coordinate mapping model with the largest consensus set as the final data sample.
  • In the above scheme, obtaining the average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data specifically includes: acquiring the pixel coordinate mapping models between the left and right views corresponding to a specified number of the most recently captured frames of stereoscopic image data before stereoscopic image synthesis, and averaging these models to obtain the average pixel coordinate mapping model.
  • In the above scheme, performing image processing based on the average pixel coordinate mapping model specifically includes:
  • performing repair processing of damaged areas based on the average pixel coordinate mapping model;
  • performing noise reduction processing based on the average pixel coordinate mapping model.
  • In the above scheme, performing repair processing of damaged areas based on the average pixel coordinate mapping model specifically includes:
  • detecting a damaged area; determining the coordinate information of the damaged area in the other, normal view based on the average pixel coordinate mapping model; replacing the image content of the current damaged area with the image content of the corresponding area in the normal view; and correcting the detected edge of the damaged area with the average of the gray values of the corresponding pixels in the left and right views.
  • In the above scheme, performing noise reduction based on the average pixel coordinate mapping model specifically includes:
  • detecting a suspected noise point; determining the position of the suspected noise point in the other view based on the average pixel coordinate mapping model and performing a gray-level comparison to determine whether it is a noise point; and correcting the gray value of each pixel in a predetermined neighborhood of each confirmed noise point with the gray value of the corresponding pixel in the other view.
  • An embodiment of the invention further provides a data processing apparatus, including:
  • a preprocessing unit configured to, when preprocessing each captured frame of stereoscopic image data, extract feature values from the left and right views of the frame and match them, and obtain a pixel coordinate mapping model between the left and right views from the matching result;
  • an image processing unit configured to perform image processing based on the average pixel coordinate mapping model;
  • each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between its left and right views, and the average pixel coordinate mapping model is obtained from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data.
  • In the above scheme, the preprocessing unit further includes a feature matching subunit;
  • the feature matching subunit is configured to extract SIFT features from the left view and the right view respectively to obtain the SIFT feature descriptors of each view, and to perform SIFT feature point matching on the descriptors of the two views to obtain the matching point pairs of the SIFT feature points of the left and right views.
  • In the above scheme, the preprocessing unit further includes a model estimation subunit;
  • the model estimation subunit is configured to obtain the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.
  • In the above scheme, the model estimation subunit is configured to randomly select a data point set from the set S formed by the matching point pairs for initialization; filter from the data point set, according to a preset threshold, a qualifying support point set Si as the consensus set; repeatedly select new data samples and estimate the pixel coordinate mapping model between the left and right views according to the comparison of the size of Si with the preset threshold, until the largest consensus set is obtained; and obtain the desired pixel coordinate mapping model with the largest consensus set as the final data sample.
  • In the above scheme, the image processing unit further includes a model mean acquisition subunit, configured to acquire the pixel coordinate mapping models between the left and right views corresponding to a specified number of the most recently captured frames of stereoscopic image data before stereoscopic image synthesis, and to average these models to obtain the average pixel coordinate mapping model.
  • In the above scheme, the image processing unit further includes a first processing subunit and a second processing subunit;
  • the first processing subunit is configured to perform repair processing of damaged areas based on the average pixel coordinate mapping model;
  • the second processing subunit is configured to perform noise reduction processing based on the average pixel coordinate mapping model.
  • The first processing subunit is further configured to detect a damaged area, determine the coordinate information of the damaged area in the other, normal view based on the average pixel coordinate mapping model, replace the image content of the current damaged area with the image content of the corresponding area in the normal view, and correct the detected edge of the damaged area with the average of the gray values of the corresponding pixels in the left and right views.
  • The second processing subunit is further configured to detect a suspected noise point, determine the position of the suspected noise point in the other view based on the average pixel coordinate mapping model, perform a gray-level comparison to determine whether it is a noise point, and correct the gray value of each pixel in a predetermined neighborhood of each confirmed noise point with the gray value of the corresponding pixel in the other view.
  • When performing processing, the preprocessing unit, image processing unit, feature matching subunit, model estimation subunit, model mean acquisition subunit, first processing subunit, and second processing subunit may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA).
  • An embodiment of the present invention further provides a computer storage medium comprising a set of instructions which, when executed, cause at least one processor to perform the data processing method of any one of the above schemes.
  • An embodiment of the present invention further provides a user terminal that includes the data processing apparatus described above.
  • The method of the present invention includes: when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left and right views of the frame and matching them, and obtaining a pixel coordinate mapping model between the left and right views from the matching result; each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between its left and right views, and obtaining an average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data; and performing image processing based on the average pixel coordinate mapping model.
  • With the embodiments of the present invention, since each captured frame of stereoscopic image data is preprocessed to obtain a pixel coordinate mapping model between the left and right views, the model is refined into an average pixel coordinate mapping model, and image processing is finally performed based on that average model, the obtained stereoscopic image data can be optimized and reconstructed.
  • FIG. 1 is an implementation flowchart of the method principle of an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the basic composition of an apparatus according to an embodiment of the present invention;
  • FIG. 3 is an implementation flowchart of SIFT feature extraction according to an embodiment of the present invention;
  • FIG. 4 is an implementation flowchart of estimating the coordinate mapping model according to an embodiment of the present invention;
  • FIG. 5 is an implementation flowchart of damage repair according to an embodiment of the present invention.
  • The application scenario of the data processing method of the embodiments of the present invention is a user terminal, in particular the multimedia application field of a mobile terminal: for example, a data processing scheme for optimizing and reconstructing two-viewpoint naked-eye stereoscopic image data during two-viewpoint stereoscopic image capture and synthesis.
  • The embodiments of the present invention are mainly based on the SIFT algorithm and the RANSAC algorithm.
  • The RANSAC algorithm builds the left-right view pixel mapping model from the SIFT matching points of the left and right views, and according to this model the damaged view is repaired and reconstructed from the normal view, providing qualified stereoscopic image data to the synthesis algorithm.
  • With the embodiments of the present invention, repaired stereoscopic image data are finally obtained; repairing damaged images with the left-right view pixel mapping model is relatively accurate, computationally light, and yields a good repair effect.
  • The SIFT algorithm was introduced above; the basic idea of the RANSAC algorithm is as follows: when estimating parameters, instead of treating all possible input data indiscriminately, a search engine is first designed for the specific problem;
  • this search engine iteratively rejects the input data (outliers) that are inconsistent with the estimated parameters, and the correct input data are then used to estimate the parameters.
  • The embodiments of the present invention use a concrete implementation of the RANSAC algorithm to obtain the left-right view pixel mapping model.
  • Stereoscopic imaging is based on the principle of creating depth from parallax.
  • This principle means that a person's two eyes view the world from different angles, i.e., there is a subtle difference between an object seen by the left eye and the same object seen by the right eye.
  • The average interocular distance is about 65 mm, so the two eyes describe the outline of a scene in slightly different ways.
  • The brain comprehensively processes these two subtly different scenes (physiological fusion), producing accurate three-dimensional object perception and the positioning of the object in the scene; this is the sense of depth.
  • The work of a stereo imaging system is to produce at least two images of each scene: one representing the image seen by the left eye and the other the image seen by the right eye.
  • The two associated images are called a stereo pair.
  • A stereo display system must ensure that the left eye sees only the left image and the right eye sees only the right image.
  • The display method addressed by the embodiments of the present invention is a two-viewpoint autostereoscopic display method: the two images of the left and right viewpoints are rearranged and combined at the sub-pixel level to generate a stereoscopic image, which is sent to the display device.
  • By placing a lenticular lens or a parallax barrier in front of a CRT or flat-panel display, the emission direction of the light of each pixel is controlled so that the image of the left viewpoint enters only the left eye and the image of the right viewpoint enters only the right eye; binocular parallax then produces stereoscopic vision.
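For illustration only, a minimal sketch of one common sub-pixel arrangement, column interleaving, is given below. The patent does not fix the arrangement, so the even/odd column split and the NumPy-based implementation are assumptions.

```python
import numpy as np

def interleave_two_views(left, right):
    """Compose a two-view autostereoscopic frame by column interleaving.

    Illustrative arrangement only: even pixel columns are taken from the
    left view and odd columns from the right view, so a parallax barrier
    or lenticular sheet can steer each column toward one eye.
    `left` and `right` must have identical shapes (H x W or H x W x 3).
    """
    assert left.shape == right.shape
    composite = left.copy()
    composite[:, 1::2] = right[:, 1::2]  # odd columns come from the right view
    return composite
```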
  • The stereoscopic image synthesis work includes preparation of the stereo data, sub-pixel decision criteria, sub-pixel sampling for each viewpoint, arrangement and synthesis of the sub-pixels of each viewpoint, and compression, transmission, and display of the stereoscopic image; the embodiments of the present invention are mainly directed at the preparation of the stereo data,
  • i.e., the optimized reconstruction of the obtained stereoscopic image data. After the stereoscopic image data are optimized and reconstructed according to the embodiments of the present invention, even when a user terminal, in particular a mobile terminal, captures stereoscopic image data rendered unusable by camera occlusion or heavy noise, a relatively high-quality stereoscopic picture can still be synthesized in the end.
  • The data processing method of an embodiment of the present invention, as shown in FIG. 1, includes:
  • Step 101: when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left view and the right view of the frame and matching them, and obtaining a pixel coordinate mapping model between the left and right views from the matching result;
  • here, the stereoscopic image data may also be called stereoscopic image material, which is not described again;
  • Step 102: each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between its left and right views, and an average pixel coordinate mapping model is obtained from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data;
  • Step 103: performing image processing based on the average pixel coordinate mapping model.
  • The data processing apparatus of an embodiment of the present invention includes:
  • a preprocessing unit configured to, when preprocessing each captured frame of stereoscopic image data, extract feature values from the left and right views of the frame and match them, and obtain a pixel coordinate mapping model between the left and right views from the matching result;
  • an image processing unit configured to perform image processing based on the average pixel coordinate mapping model, where each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between its left and right views, and
  • the average pixel coordinate mapping model is obtained from the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data.
  • The computer storage medium of an embodiment of the present invention comprises a set of instructions which, when executed, cause at least one processor to perform the data processing method.
  • The user terminal of an embodiment of the present invention includes the basic structure of the data processing apparatus described above; its variations and equivalents are not described again here.
  • The following is an application scenario of an embodiment of the present invention (a user taking a photo with a mobile phone):
  • The data processing method of this embodiment is specifically a two-viewpoint naked-eye stereoscopic image data optimization and reconstruction scheme.
  • The user uses the mobile phone to take a stereoscopic picture.
  • The user must first preview and then press the shutter button; the period just before the shutter is pressed is generally a relatively stable and reliable stereo-capture state, whereas the act of pressing the shutter is prone to sudden noise.
  • The embodiment therefore performs preprocessing before the photo is taken in order to obtain stable stereoscopic image data: SIFT features are extracted from the left and right views of the preview image and matched to determine the pixel coordinate mapping model between the left and right views of the current scene;
  • according to this mapping model, occlusion repair and denoising are performed on low-quality single-view stereo data to generate reliable dual-view stereo data.
  • Steps 1 to 6 below are performed during image preview (i.e., the preprocessing described above), and steps 7 to 11 are performed after the shutter button is pressed to take the picture.
  • The stereoscopic picture is synthesized from the optimized and reconstructed stereoscopic image data produced by the preprocessing, with the final result obtained after the damaged image is repaired and denoised.
  • Step 1: Preprocess the left- and right-view images during preview, including scale conversion and image smoothing. Because the phone's preview mode differs from its camera-mode settings, the image captured in preview mode (for example, a 5 Mp image) is first scale-converted; since the SIFT feature description algorithm is scale-invariant, this does not affect SIFT feature extraction and matching.
  • A Gaussian low-pass filter is used for image smoothing because it effectively suppresses the ringing effect and has a clear noise-removal effect.
  • The beneficial effect is as follows: smoothing with a Gaussian low-pass filter mainly reduces the influence of scale conversion on picture quality, thereby ensuring the reliability of the coordinate mapping model. A minimal sketch of this step is given below.
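The sketch assumes OpenCV is available; the preview resolution, kernel size, and sigma are illustrative values not fixed by the patent.

```python
import cv2

def preprocess_view(img, preview_size=(640, 480), ksize=5, sigma=1.0):
    """Step-1 preprocessing sketch: scale the captured view to the preview
    resolution, then smooth it with a Gaussian low-pass filter to suppress
    scaling artifacts and noise before SIFT extraction.
    """
    scaled = cv2.resize(img, preview_size, interpolation=cv2.INTER_AREA)
    return cv2.GaussianBlur(scaled, (ksize, ksize), sigma)
```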
  • Step 2: Extract SIFT features from the processed left- and right-view images.
  • Step 111: prepare the image whose SIFT features are to be extracted;
  • Step 112: construct the scale space;
  • Step 113: detect spatial extreme points;
  • Step 114: accurately localize the spatial extreme points;
  • Step 115: remove edge response points;
  • Step 116: generate the SIFT feature descriptors.
  • The SIFT feature descriptors of the left view are obtained through steps 111-116, the SIFT feature descriptors of the right view are obtained through the corresponding steps 121-126, and then step 13 is performed.
  • Step 13: SIFT feature point matching is finally performed using the SIFT feature descriptors obtained from the left view (steps 111-116) and from the right view (steps 121-126),
  • yielding the SIFT feature point matching point pairs of the left and right views, referred to simply as matching point pairs.
  • (a) Detecting scale-space extreme points, comprising generation of the scale space and detection of spatial extreme points.
  • SIFT uses a difference-of-Gaussian (DoG) scale space, formed by convolving difference-of-Gaussian kernels of adjacent scales with the input image.
  • The DoG kernel is a linear approximation of the LoG kernel and also greatly simplifies the scale-space computation.
  • Extreme points are searched in the neighborhood spanning both image space and DoG scale space, giving the initial positions of the feature points.
  • Each candidate point is compared with its 8 neighbors at the same scale and the 9 × 2 points at the corresponding positions in the adjacent upper and lower scales, ensuring that extreme points are detected in both scale space and two-dimensional image space.
  • If a point is the maximum or minimum among these 26 neighbors across its DoG scale-space layer and the layers above and below, it is taken as a feature point of the image at that scale.
  • To enhance the robustness of matching, each key point is described by a 4 × 4 array of 16 seed points, each carrying 8 orientation components; a key point therefore generates 128 values, which ultimately form a 128-dimensional SIFT feature vector.
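As a concrete illustration of steps 111-116, the sketch below relies on OpenCV's bundled SIFT implementation (available as cv2.SIFT_create in OpenCV 4.4+) rather than re-implementing the scale-space pipeline; treating that implementation as equivalent to the patent's flow is an assumption.

```python
import cv2

def extract_sift(gray):
    """Extract SIFT key points and 128-D descriptors (steps 111-116).

    OpenCV's SIFT internally builds the DoG scale space, detects and
    refines extreme points, removes edge responses, and emits one
    128-dimensional descriptor (4x4 seed points x 8 orientations) per
    key point.
    """
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors  # descriptors: N x 128, float32
```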
  • Step 3: Match the SIFT feature points of the left and right views.
  • The Euclidean distance between key-point feature vectors is used as the similarity metric for key points in the two images.
  • A key point is taken from the left view, and the two key points with the nearest and second-nearest Euclidean distances are found in the right view.
  • According to the distance-ratio criterion, the second-nearest distance d2 is divided by the nearest distance d1 to obtain ratio = d2 / d1; if ratio exceeds a proportional threshold, the nearest candidate is accepted as the match.
  • According to experimental results, a proportional threshold of 1.5 ensures that the number of matching points is sufficient while keeping the computation effectively under control.
  • As the threshold increases, the number of SIFT matching points decreases but the retained matches become more stable; lowering the threshold increases the number of matches at the cost of stability.
  • In the embodiment of the present invention, the accuracy required of the SIFT matching point pairs is high, and because the difference between the left and right views is small, too many matching points would make the computation too heavy; the threshold is therefore increased appropriately and is set to 1.8 in this example.
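A minimal sketch of the distance-ratio matching of step 3, assuming the d2/d1 reading of the criterion reconstructed above; OpenCV's brute-force matcher performs the Euclidean nearest-neighbor search.

```python
import cv2

def match_sift(desc_left, desc_right, ratio_threshold=1.8):
    """Distance-ratio matching between left- and right-view descriptors.

    For each left key point the two nearest right descriptors (Euclidean
    distance) are found; the pair is kept only when the second-nearest
    distance is at least ratio_threshold times the nearest distance
    (the embodiment's threshold of 1.8).
    """
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = []
    for pair in matcher.knnMatch(desc_left, desc_right, k=2):
        if len(pair) < 2:
            continue  # no second neighbor available
        nearest, second = pair
        if second.distance >= ratio_threshold * nearest.distance:
            matches.append(nearest)
    return matches
```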
  • Step 4: Estimate the pixel coordinate mapping model between the left and right views of the current scene using the RANSAC algorithm.
  • The flow of estimating the pixel coordinate mapping model between the left and right views is shown in FIG. 4 and includes:
  • Step 201: randomly select 8 matching point pairs to initialize the pixel coordinate mapping model between the left and right views;
  • Step 202: find the support point set of the current model;
  • Step 203: determine whether the size of the support point set meets a threshold; if so, execute step 204, otherwise return to step 201.
  • According to experimental results, the threshold may be set to 67% of the entire data point set, which ensures the validity of the model.
  • Step 204: estimate the pixel coordinate mapping model between the left and right views.
  • The flow of FIG. 4 estimates the left-right view pixel mapping model by using the coordinate information of the matching point pairs in the two views as the input parameters of the RANSAC model estimation;
  • the largest consensus set Si is selected and used to re-estimate the model, giving the final result.
  • The model estimation in this step is in effect the process of solving the mapping model F; since there are enough matching points, a reliable coordinate mapping model can be generated.
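The sketch below mirrors steps 201-204 as a generic RANSAC loop. The patent does not specify the algebraic form of the mapping model F, so a homography fitted with OpenCV is used as a stand-in; the reprojection-error threshold and iteration cap are likewise assumptions.

```python
import random
import numpy as np
import cv2

def ransac_mapping_model(pts_left, pts_right, err_thresh=3.0,
                         support_ratio=0.67, max_iters=500):
    """RANSAC skeleton for the left-right pixel coordinate mapping model.

    pts_left / pts_right: N x 2 matched pixel coordinates (N >= 8 assumed).
    """
    pts_left = np.asarray(pts_left, np.float32)
    pts_right = np.asarray(pts_right, np.float32)
    n = len(pts_left)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(max_iters):
        sample = random.sample(range(n), 8)                 # step 201: 8 random pairs
        model, _ = cv2.findHomography(pts_left[sample], pts_right[sample], 0)
        if model is None:
            continue  # degenerate sample
        proj = cv2.perspectiveTransform(pts_left.reshape(-1, 1, 2), model)
        err = np.linalg.norm(proj.reshape(-1, 2) - pts_right, axis=1)
        inliers = err < err_thresh                          # step 202: support set Si
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
        if inliers.sum() >= support_ratio * n:              # step 203: 67% threshold
            break
    # step 204: re-estimate the model from the largest consensus set
    model, _ = cv2.findHomography(pts_left[best_inliers],
                                  pts_right[best_inliers], 0)
    return model, best_inliers
```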
  • Step 5: Only the correct matching point pairs that are consistent with the pixel coordinate mapping model of step 4 (i.e., the point set in Si) are saved, together with the obtained mapping model, as the reference information of the frame.
  • Step 6: Steps 1 to 5 are looped continuously, retaining only the image reference information of the latest four frames in a queue:
  • each time a new frame of stereo image data is captured, the information at the head of the queue is deleted and the newest frame's reference information is stored at the tail.
  • Step 7: Take the picture, smooth the left and right views, and compute the average pixel coordinate mapping model from the current left-right mapping model together with those of the previous three frames. After the photo is taken, the user chooses whether the views need repair; if so, processing continues.
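A minimal sketch of the four-frame queue of step 6 and the model averaging of step 7; element-wise averaging of the model matrices is an assumption, since the patent only states that the models are averaged.

```python
from collections import deque
import numpy as np

# Reference information of the latest four frames; the head is dropped
# automatically when a fifth model is appended (step 6).
model_queue = deque(maxlen=4)

def push_frame_model(model):
    """Store the newest frame's mapping model at the tail of the queue."""
    model_queue.append(model)

def average_mapping_model():
    """Step 7: element-wise mean of the queued mapping-model matrices."""
    return np.mean(np.stack(model_queue), axis=0)
```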
  • Step 8: Check for occlusion.
  • Each of the left and right views is divided into 8 blocks, the average gray value of each block is computed, and the average gray values of corresponding blocks in the two views are compared. If the relative difference of a block's average gray value exceeds 10%, the block with the lower gray value is regarded as a damaged block. If the number of damaged blocks is 0, jump to step 10.
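A minimal sketch of the step-8 occlusion check; splitting each view into horizontal strips is an assumption, as the patent only says each view is divided into 8 blocks.

```python
import numpy as np

def find_damaged_blocks(gray_left, gray_right, n_blocks=8, rel_diff=0.10):
    """Compare mean gray values of corresponding blocks of the two views
    and mark the darker block as damaged when the relative difference
    exceeds 10%.
    """
    damaged = []  # list of (block_index, 'left' or 'right')
    for i, (bl, br) in enumerate(zip(np.array_split(gray_left, n_blocks),
                                     np.array_split(gray_right, n_blocks))):
        ml, mr = bl.mean(), br.mean()
        if abs(ml - mr) / (max(ml, mr) + 1e-6) > rel_diff:
            damaged.append((i, 'left' if ml < mr else 'right'))
    return damaged
```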
  • Step 9: Repair the occluded area, as shown in FIG. 5.
  • Step 301: accurately detect the damaged area (Sobel operator);
  • Step 302: compute the coordinate information of the damaged area in the normal view;
  • Step 303: repair the damaged area;
  • Step 304: repair the edges.
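A minimal sketch of steps 302-304, assuming the average mapping model maps damaged-view coordinates to normal-view coordinates (so the inverse warp brings the normal view into register) and that the damaged-area mask comes from the Sobel-based detection of step 301.

```python
import numpy as np
import cv2

def repair_damaged_region(damaged_view, normal_view, mask, model):
    """Replace damaged content with the corresponding region of the normal
    view and blend the mask edge with the per-pixel average of both views.

    mask: uint8 binary mask of the damaged area (nonzero = damaged).
    model: 3x3 mapping matrix from damaged-view to normal-view coordinates.
    """
    h, w = damaged_view.shape[:2]
    # steps 302-303: pull the corresponding content out of the normal view
    warped = cv2.warpPerspective(normal_view, np.linalg.inv(model), (w, h))
    repaired = damaged_view.copy()
    repaired[mask > 0] = warped[mask > 0]
    # step 304: correct the edge with the mean gray value of the two views
    edge = cv2.morphologyEx(mask, cv2.MORPH_GRADIENT, np.ones((3, 3), np.uint8))
    repaired[edge > 0] = ((damaged_view[edge > 0].astype(np.uint16) +
                           warped[edge > 0].astype(np.uint16)) // 2).astype(np.uint8)
    return repaired
```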
  • Step 10: Noise point detection.
  • The left and right views may contain a certain amount of salt-and-pepper noise.
  • A median-filter-style scan of the left and right views is used to detect noise, and the noise points are marked:
  • within the neighborhood of each point, the maximum, minimum, and mean gray values are taken; if the gray value of the current point is the maximum or minimum within the region and lies outside the set threshold band (60%-150% of the neighborhood's average gray value is the basic band; values outside this range exceed the set threshold), it may be noise and is marked as suspicious.
  • The coordinate mapping model is then used to determine the position of each suspicious point in the other view, and the gray-level comparison is performed again at that position to determine whether the current point is indeed a noise point.
  • Step 11: Fix the noise points.
  • The gray value of each pixel in the 3 × 3 neighborhood of each noise point confirmed in step 10 is corrected with the gray value of the corresponding pixel in the other view. A combined sketch of steps 10 and 11 follows.
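The neighborhood-extreme test below follows the 60%-150% band described in step 10 (the patent also mentions a median filter, which this sketch does not reproduce), and the cross-view confirmation tolerance is an assumption. The per-pixel Python loop is written for clarity, not speed.

```python
import numpy as np
import cv2

def detect_and_fix_noise(gray, other_gray, model, low=0.6, high=1.5, tol=10):
    """Steps 10-11 sketch: flag salt-and-pepper candidates, confirm them
    against the other view through the mapping model, then overwrite the
    3x3 neighborhood of confirmed noise with the other view's pixels.
    """
    h, w = gray.shape
    fixed = gray.copy()
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = gray[y-1:y+2, x-1:x+2]
            v, m = int(gray[y, x]), win.mean()
            # suspicious only if it is the window extreme AND outside the band
            if v not in (win.max(), win.min()) or low * m <= v <= high * m:
                continue
            # map (x, y) into the other view with the average mapping model
            q = cv2.perspectiveTransform(np.float32([[[x, y]]]), model)[0, 0]
            qx, qy = int(round(q[0])), int(round(q[1]))
            if (1 <= qx < w - 1 and 1 <= qy < h - 1
                    and abs(int(other_gray[qy, qx]) - v) > tol):
                # step 11: correct the 3x3 neighborhood from the other view
                fixed[y-1:y+2, x-1:x+2] = other_gray[qy-1:qy+2, qx-1:qx+2]
    return fixed
```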
  • Step 12 Submit the optimized stereo image data to the synthesis algorithm, and use the existing synthesis algorithm to synthesize the stereo image.
  • The integrated modules described in the embodiments of the present invention may also be stored in a computer-readable storage medium if they are implemented in the form of software function modules and sold or used as independent products. Based on this understanding, the technical solution of the embodiments of the present invention may, in essence or in the part contributing to the prior art, be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present invention.
  • The foregoing storage medium includes media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • In summary, since each captured frame of stereoscopic image data is preprocessed to obtain a pixel coordinate mapping model between the left and right views, the model is refined into an average pixel coordinate mapping model, and image processing is finally performed based on that average model, the obtained stereoscopic image data can be optimized and reconstructed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Disclosed are a data processing method, apparatus, computer storage medium, and user terminal. The method comprises: when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left and right views of the frame and matching them, and obtaining a pixel coordinate mapping model between the left view and the right view from the matching result; each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between its left and right views, and obtaining an average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data; and performing image processing on the basis of the average pixel coordinate mapping model. The present invention at least solves the problem that stable stereoscopic image data cannot be obtained, and can also accomplish optimization and reconstruction of the obtained stereoscopic image data.

Description

Data processing method, apparatus, computer storage medium and user terminal

Technical Field

The present invention belongs to the field of multimedia application technologies, and in particular relates to a data processing method, apparatus, computer storage medium, and user terminal.

Background Art

SIFT is the local feature descriptor used in the SIFT local feature description algorithm. SIFT features are highly distinctive, information-rich, and strongly invariant to most image transformations. In Mikolajczyk's invariance comparison experiments on ten local descriptors, including the SIFT operator, SIFT and its extensions proved to have the strongest robustness among comparable descriptors; robustness here denotes a measure of stability.

The SIFT algorithm consists of two parts: a scale-invariant region-of-interest detector and a feature descriptor based on the gray-level gradient distribution of the region of interest. Its main characteristics are as follows:

a) SIFT features are local image features that remain invariant to rotation, scale change, and brightness variation, and retain a degree of stability under viewpoint change, affine transformation, and noise.

b) Good distinctiveness and rich information content, suitable for fast and accurate matching in massive feature databases.

c) Multiplicity: even a few objects can produce a large number of SIFT feature vectors.

d) High speed: an optimized SIFT matching algorithm can even meet real-time requirements.

e) Scalability: SIFT features can conveniently be combined with other forms of feature vectors.

The SIFT feature matching algorithm comprises two stages: the first stage is the generation of SIFT features, i.e., extracting from multiple images feature vectors invariant to scale, rotation, and brightness changes; the second stage is the matching of the SIFT feature vectors.

Today, the SIFT algorithm is widely used in object recognition, image restoration, image stitching, and other areas.

Random sample consensus (RANSAC) is a robust estimation method proposed by Fischler and Bolles. RANSAC has become an important technique for linear and nonlinear model estimation.

In implementing the technical solutions of the embodiments of the present application, the inventors found at least the following technical problems in the prior art:

In the multimedia application field of user terminals such as mobile terminals, a capture device performs stereoscopic imaging with the captured image data. Stereoscopic image synthesis comprises preparation of the stereo data, sub-pixel decision criteria, sub-pixel sampling for each viewpoint, arrangement and synthesis of the sub-pixels of each viewpoint, and compression, transmission, and display of the stereoscopic image.

In the process of capturing and synthesizing two-viewpoint stereoscopic images, each frame of stereo data contains only the two pictures of the left and right viewpoints, so the fault tolerance of the synthesis algorithm is low. When a mobile terminal is used to capture dual-viewpoint stereoscopic image data, problems such as finger occlusion easily arise because of unstable user operation; when the capture device suffers heavy noise, camera occlusion, and similar problems, the stereoscopic image data easily become unusable and unstable, so that an erroneous or low-quality stereoscopic picture is finally synthesized. Clearly, the stability of the stereoscopic image data serving as the data source plays a crucial role in the final synthesized stereoscopic picture; however, no effective solution yet exists for obtaining stable stereoscopic image data.

Summary of the Invention
The embodiments of the present invention are intended to provide a data processing method, apparatus, and user terminal that at least solve the problem that stable stereoscopic image data cannot be obtained, and can optimize and reconstruct the obtained stereoscopic image data.

The technical solution of the embodiments of the present invention is implemented as follows:

An embodiment of the present invention provides a data processing method, including:

when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left view and the right view of the frame and matching them, and obtaining a pixel coordinate mapping model between the left and right views from the matching result;

each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between its left and right views, and obtaining an average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data;

performing image processing based on the average pixel coordinate mapping model.

In the above scheme, extracting and matching feature values from the left and right views of each frame of stereoscopic image data specifically includes:

extracting SIFT features from the left view and the right view respectively, to obtain the SIFT feature descriptors of each view;

performing SIFT feature point matching on the SIFT feature descriptors of the two views, to obtain the matching point pairs of the SIFT feature points of the left and right views.

In the above scheme, obtaining the pixel coordinate mapping model between the left and right views from the matching result specifically includes:

obtaining the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.

In the above scheme, obtaining the pixel coordinate mapping model with the matching point pairs as input parameters specifically includes:

randomly selecting a data point set from the set S formed by the matching point pairs, for initialization; filtering from the data point set, according to a preset threshold, a qualifying support point set Si as the consensus set; repeatedly selecting new data samples and estimating the pixel coordinate mapping model between the left and right views according to the comparison of the size of Si with the preset threshold, until the largest consensus set is obtained; and obtaining the desired pixel coordinate mapping model with the largest consensus set as the final data sample.

In the above scheme, obtaining the average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data specifically includes:

acquiring the pixel coordinate mapping models between the left and right views corresponding to a specified number of the most recently captured frames of stereoscopic image data before stereoscopic image synthesis, and averaging these models to obtain the average pixel coordinate mapping model.

In the above scheme, performing image processing based on the average pixel coordinate mapping model specifically includes:

performing repair processing of damaged areas based on the average pixel coordinate mapping model;

performing noise reduction processing based on the average pixel coordinate mapping model.

In the above scheme, performing repair processing of damaged areas based on the average pixel coordinate mapping model specifically includes:

detecting a damaged area; determining the coordinate information of the damaged area in the other, normal view based on the average pixel coordinate mapping model; replacing the image content of the current damaged area with the image content of the corresponding area in the normal view; and correcting the detected edge of the damaged area with the average of the gray values of the corresponding pixels in the left and right views.

In the above scheme, performing noise reduction based on the average pixel coordinate mapping model specifically includes:

detecting a suspected noise point; determining the position of the suspected noise point in the other view based on the average pixel coordinate mapping model and performing a gray-level comparison to determine whether it is a noise point; and correcting the gray value of each pixel in a predetermined neighborhood of each confirmed noise point with the gray value of the corresponding pixel in the other view.

An embodiment of the present invention further provides a data processing apparatus, including:

a preprocessing unit configured to, when preprocessing each captured frame of stereoscopic image data, extract feature values from the left and right views of the frame and match them, and obtain a pixel coordinate mapping model between the left and right views from the matching result;

an image processing unit configured to perform image processing based on the average pixel coordinate mapping model, where each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between its left and right views, and the average pixel coordinate mapping model is obtained from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data.

In the above scheme, the preprocessing unit further includes a feature matching subunit;

the feature matching subunit is configured to extract SIFT features from the left view and the right view respectively to obtain the SIFT feature descriptors of each view, and to perform SIFT feature point matching on the descriptors of the two views to obtain the matching point pairs of the SIFT feature points of the left and right views.

In the above scheme, the preprocessing unit further includes a model estimation subunit;

the model estimation subunit is configured to obtain the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.

In the above scheme, the model estimation subunit is configured to randomly select a data point set from the set S formed by the matching point pairs for initialization; filter from the data point set, according to a preset threshold, a qualifying support point set Si as the consensus set; repeatedly select new data samples and estimate the pixel coordinate mapping model between the left and right views according to the comparison of the size of Si with the preset threshold, until the largest consensus set is obtained; and obtain the desired pixel coordinate mapping model with the largest consensus set as the final data sample.

In the above scheme, the image processing unit further includes a model mean acquisition subunit, configured to acquire the pixel coordinate mapping models between the left and right views corresponding to a specified number of the most recently captured frames of stereoscopic image data before stereoscopic image synthesis, and to average these models to obtain the average pixel coordinate mapping model.

In the above scheme, the image processing unit further includes a first processing subunit and a second processing subunit;

the first processing subunit is configured to perform repair processing of damaged areas based on the average pixel coordinate mapping model;

the second processing subunit is configured to perform noise reduction processing based on the average pixel coordinate mapping model.

In the above scheme, the first processing subunit is further configured to detect a damaged area, determine the coordinate information of the damaged area in the other, normal view based on the average pixel coordinate mapping model, replace the image content of the current damaged area with the image content of the corresponding area in the normal view, and correct the detected edge of the damaged area with the average of the gray values of the corresponding pixels in the left and right views.

In the above scheme, the second processing subunit is further configured to detect a suspected noise point, determine the position of the suspected noise point in the other view based on the average pixel coordinate mapping model, perform a gray-level comparison to determine whether it is a noise point, and correct the gray value of each pixel in a predetermined neighborhood of each confirmed noise point with the gray value of the corresponding pixel in the other view.

When performing processing, the preprocessing unit, the image processing unit, the feature matching subunit, the model estimation subunit, the model mean acquisition subunit, the first processing subunit, and the second processing subunit may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA).

An embodiment of the present invention further provides a computer storage medium comprising a set of instructions which, when executed, cause at least one processor to perform the data processing method of any one of the above schemes.

An embodiment of the present invention further provides a user terminal that includes the data processing apparatus described above.

The method of the present invention includes: when preprocessing each captured frame of stereoscopic image data, extracting feature values from the left and right views of the frame and matching them, and obtaining a pixel coordinate mapping model between the left and right views from the matching result; each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between its left and right views, and obtaining an average pixel coordinate mapping model from the pixel coordinate mapping models corresponding to multiple frames of stereoscopic image data; and performing image processing based on the average pixel coordinate mapping model.

With the embodiments of the present invention, since each captured frame of stereoscopic image data is preprocessed to obtain a pixel coordinate mapping model between the left and right views, the model is refined into an average pixel coordinate mapping model, and image processing is finally performed based on that average model, the obtained stereoscopic image data can be optimized and reconstructed.

Brief Description of the Drawings
图 1为本发明实施例方法原理的实现流程图;  1 is a flowchart of implementing a method principle according to an embodiment of the present invention;
图 2为本发明实施例装置基本组成结构示意图;  2 is a schematic structural diagram of a basic composition of an apparatus according to an embodiment of the present invention;
图 3本发明实施例 SIFT特征提取的实现流程图;  FIG. 3 is a flowchart of implementing SIFT feature extraction according to an embodiment of the present invention; FIG.
图 4为本发明实施例估计坐标映射模型的实现流程图;  4 is a flowchart of an implementation of estimating a coordinate mapping model according to an embodiment of the present invention;
图 5为本发明实施例破损修复的实现流程图。 具体实施方式  FIG. 5 is a flowchart of implementing damage repair according to an embodiment of the present invention. detailed description
下面结合附图对技术方案的实施作进一步的详细描述。  The implementation of the technical solution will be further described in detail below with reference to the accompanying drawings.
本发明实施例的数据处理方法的应用场景是用户终端, 尤其是移动终 端的多媒体应用技术领域, 例如, 两视点立体图像釆集合成过程中两视点 棵眼立体图像数据优化重建的数据处理方案, 至少解决了现有技术中不能 得到稳定的立体图像数据的问题, 确保用户终端, 尤其是移动终端釆集立 体图像数据时不会因为摄像头遮挡、 噪声较大而导致产生的立体图像数据 不可用时仍能最终合成质量相对较高的立体图片。  The application scenario of the data processing method in the embodiment of the present invention is a user terminal, in particular, a multimedia application technology field of a mobile terminal, for example, a data processing scheme for optimizing and reconstructing two-point stereoscopic image data in a two-view stereoscopic image collection process. At least the problem that the stable stereoscopic image data cannot be obtained in the prior art is solved, and the user terminal, especially when the mobile terminal collects the stereoscopic image data, is not caused by the occlusion of the camera and the noise, and the generated stereoscopic image data is still unavailable. It is possible to finally synthesize a stereoscopic picture of relatively high quality.
本发明实施例主要基于 SIFT算法和 RANSAC算法, 是通过 RANSAC 算法利用左右视图的 SIFT匹配点建立左右视图像素映射模型, 根据该左右 视图像素映射模型, 用正常视图对破损的视图进行修复和重建, 为合成算 法提供合格的立体图像数据, 釆用本发明实施例, 最终能得到修改好的立 体图像数据, 釆用该左右视图像素映射模型对于修复破损图像较为精准, 计算量较小, 修复效果好。 The embodiment of the present invention is mainly based on the SIFT algorithm and the RANSAC algorithm. The RANSAC algorithm is used to establish the left and right view pixel mapping model by using the SIFT matching points of the left and right views, and the damaged view is repaired and reconstructed according to the left and right view pixel mapping model. Providing qualified stereoscopic image data for the synthesis algorithm, using the embodiment of the present invention, and finally obtaining the modified stereo Body image data, using the left and right view pixel mapping model is more accurate for repairing damaged images, the calculation amount is small, and the repair effect is good.
其中, SIFT算法在之前已经介绍过了,而 RANSAC算法的基本思想是: 在进行参数估计时, 不是不加区分地对待所有可能的输入数据, 而是首先 针对具体问题设计出一个搜索引擎, 利用此搜索引擎迭代地剔除那些与所 估计参数不一致的输入数据 ( Outliers ), 然后利用正确的输入数据来估计参 数。 本发明实施例釆用 RANSAC算法的具体实现来得到左右视图像素映射 模型。  Among them, the SIFT algorithm has been introduced before, and the basic idea of the RANSAC algorithm is: When performing parameter estimation, instead of treating all possible input data indiscriminately, first design a search engine for specific problems, This search engine iteratively rejects input data (Outliers) that are inconsistent with the estimated parameters and then uses the correct input data to estimate the parameters. In the embodiment of the present invention, a specific implementation of the RANSAC algorithm is used to obtain a left-right view pixel mapping model.
Stereoscopic imaging is based on the principle that parallax creates the impression of depth. This principle means that a person's two eyes view the world from different angles, so that an object seen by the left eye differs subtly from the same object seen by the right eye; since the average distance between the two eyes is about 65 mm, the two eyes describe the outline of a scene in slightly different ways. The brain fuses these two subtly different views (physiological fusion), producing an accurate perception of three-dimensional objects and of their positions in the scene; this is the sense of stereoscopic depth.

The task of a stereoscopic imaging system is to produce at least two images of every scene, one representing what the left eye sees and the other what the right eye sees; these two related images are called a stereo pair. A stereoscopic display system must then ensure that the left eye sees only the left image and the right eye sees only the right image. The display method addressed by the embodiments of the present invention is two-viewpoint autostereoscopic display: from the two images of the left and right viewpoints, a stereoscopic image is generated by rearranging and combining the sub-pixels and is sent to the display device. A lenticular sheet or a parallax barrier placed in front of a CRT or flat-panel display controls the exit direction of the light of each pixel, so that the left-viewpoint image reaches only the left eye and the right-viewpoint image reaches only the right eye; binocular parallax then produces stereoscopic vision.

Stereoscopic image synthesis comprises the preparation of the stereoscopic data, the sub-pixel decision criteria, the sub-sampling of the pixels of each viewpoint, the arrangement and synthesis of the sub-pixels of each viewpoint, and the compression, transmission and display of the stereoscopic image. The embodiments of the present invention mainly address the data-preparation stage, that is, the optimized reconstruction of the acquired stereoscopic image data. After this optimized reconstruction, even if the stereoscopic image data acquired by a user terminal, especially a mobile terminal, is rendered unusable by camera occlusion or heavy noise, a stereoscopic picture of relatively high quality can still be synthesized in the end.
As shown in FIG. 1, the data processing method of an embodiment of the present invention includes:

Step 101: when preprocessing each frame of acquired stereoscopic image data, extract feature values from the left view and the right view of the frame of stereoscopic image data respectively and match them, and obtain a pixel coordinate mapping model between the left and right views according to the matching result.

Here, the stereoscopic image data may also be referred to as stereoscopic image material; this is not elaborated further.

Step 102: each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between the left and right views; obtain an average pixel coordinate mapping model according to the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data.

Step 103: perform image processing based on the average pixel coordinate mapping model.
As shown in FIG. 2, the data processing apparatus of an embodiment of the present invention includes:

a preprocessing unit configured to, when preprocessing each frame of acquired stereoscopic image data, extract feature values from the left view and the right view of the frame of stereoscopic image data respectively and match them, and obtain a pixel coordinate mapping model between the left and right views according to the matching result; and

an image processing unit configured to perform image processing based on the average pixel coordinate mapping model, where each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between the left and right views and the average pixel coordinate mapping model is obtained according to the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data.

The computer storage medium of an embodiment of the present invention includes a set of instructions that, when executed, cause at least one processor to perform the data processing method.

The user terminal of an embodiment of the present invention includes the basic structure of the data processing apparatus of the embodiments of the present invention together with its variations and equivalents, which are not elaborated here.

An application scenario of an embodiment of the present invention (the scene in which the user terminal is a mobile phone taking a photograph) is described in detail below:
In the mobile-phone photographing scenario, the data processing method of the embodiment of the present invention is specifically a processing scheme for the optimized reconstruction of two-viewpoint naked-eye stereoscopic image data. Whenever a user shoots a stereoscopic picture with a mobile phone, a preview is shown first. During the period before the user presses the shutter key, stereoscopic data acquisition is generally in a relatively stable and reliable state, whereas the act of pressing the shutter is prone to sudden noise. To achieve the purpose of the embodiments of the present invention, preprocessing is performed before the photograph is taken so as to obtain stable stereoscopic image data: SIFT features are extracted from the left and right views of the preview image and matched, and the pixel coordinate mapping model between the left and right views of the current scene is then determined. Based on this mapping model, occlusion repair and denoising are performed on low-quality single-viewpoint stereoscopic data, thereby generating reliable two-viewpoint stereoscopic data.
The detailed steps are described below. The first to sixth steps are carried out during image preview (i.e., the preprocessing process described above), and the seventh to eleventh steps are carried out after the shutter key is pressed to take the photograph (i.e., stereoscopic image synthesis is performed on the optimized and reconstructed stereoscopic image data obtained by the preprocessing process, and the final result is obtained after damaged-image repair and denoising).
First step: during preview, preprocess the left and right preview images, including scale scaling and image smoothing. Because the phone's preview mode and photographing mode use different settings, the pictures acquired in preview mode are first scaled, e.g., to 5 Mp. Since the SIFT feature description algorithm is scale-invariant, this does not affect SIFT feature extraction and matching. Image smoothing applies a Gaussian low-pass filter, because the Gaussian low-pass filter effectively overcomes the ringing effect and clearly suppresses noise. The benefit obtained in this step is that smoothing the image with a Gaussian low-pass filter reduces the influence of scale scaling on picture quality, ensuring the reliability of the coordinate mapping model.
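As a rough sketch of this step (the file name, target resolution and Gaussian parameters below are illustrative assumptions; the text only specifies scaling to the photo resolution, e.g. 5 Mp, followed by Gaussian low-pass smoothing), the preprocessing could look like this in Python with OpenCV:

```python
import cv2

preview = cv2.imread("preview_left.png")                 # placeholder preview frame
scaled = cv2.resize(preview, (2592, 1944))               # scale to roughly 5 Mp
smoothed = cv2.GaussianBlur(scaled, (5, 5), sigmaX=1.5)  # low-pass smoothing, avoids ringing
```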
Second step: extract SIFT features from the processed left and right view images.
Here, the flow of SIFT matching is shown in FIG. 3. The operation steps for the left view (steps 111-116) and for the right view (steps 121-126) are identical; taking the left view as an example, they include:

Step 111: prepare to extract the SIFT features of the image;

Step 112: construct the scale space;

Step 113: detect spatial extremum points;

Step 114: precisely locate the spatial extremum points;

Step 115: remove edge response points;

Step 116: generate the SIFT feature descriptors.
After the SIFT feature descriptors have been obtained through the left-view operation steps (steps 111-116) and through the right-view operation steps (steps 121-126), step 13 is performed.
Step 13: finally, perform SIFT feature point matching between the SIFT feature descriptors obtained through the left-view operation steps (steps 111-116) and those obtained through the right-view operation steps (steps 121-126), so as to obtain the matching pairs of SIFT feature points of the left and right views, hereinafter referred to simply as matching point pairs.
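A hedged sketch of steps 111-116 / 121-126 using OpenCV's SIFT implementation (cv2.SIFT_create requires OpenCV 4.4 or newer, or the contrib build; the file names are placeholders):

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_l, des_l = sift.detectAndCompute(left, None)   # keypoints and 128-D descriptors, left view
kp_r, des_r = sift.detectAndCompute(right, None)  # keypoints and 128-D descriptors, right view
```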
In summary, the main points of the flow in FIG. 3 are:

(a) Detecting scale-space extremum points, covering the generation of the scale space and the detection of spatial extremum points.

Specifically, SIFT uses a difference-of-Gaussians (DoG) scale space, built by convolving the input image with difference-of-Gaussians kernels of adjacent scales. The DoG kernel is not only a linear approximation of the LoG but also greatly simplifies the computation of the scale space. For each pixel, extremum points are searched in the neighborhood spanning its image space and DoG scale space, giving the preliminary locations of the feature points. The point under test in the middle level is compared with the 26 points of its neighborhood, namely its 8 neighbors at the same scale and the 9 × 2 points at the corresponding positions in the adjacent scales above and below, which ensures that extremum points are detected in both scale space and two-dimensional image space. A point is regarded as a feature point of the image at that scale if it is the maximum or minimum among these 26 neighbors across its own DoG level and the levels above and below.

(b) Precisely locating the spatial extremum points, covering the removal of low-contrast key points and of edge response points.

(c) Generating the 128-dimensional SIFT feature descriptors.

Specifically, in the actual computation, to make the matching more robust, each key point is described by 4 × 4 = 16 seed points, each carrying 8 orientation-vector entries, so that a single key point yields 128 values, finally forming the 128-dimensional SIFT feature vector.
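The 26-neighbour extremum test of (a) can be expressed compactly. The sketch below assumes that `dog` is a stack of three adjacent DoG levels with shape 3 × H × W, and tests an interior pixel of the middle level:

```python
import numpy as np

def is_scale_space_extremum(dog, y, x):
    patch = dog[:, y-1:y+2, x-1:x+2]       # 3 x 3 x 3 neighbourhood, 27 samples
    centre = dog[1, y, x]
    others = np.delete(patch.ravel(), 13)  # drop the centre, keep the 26 neighbours
    return centre > others.max() or centre < others.min()
```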
Third step: match the SIFT feature points of the left and right views. After the SIFT feature vectors of the two images (image 1 and image 2) have been generated, the Euclidean distance between key-point feature vectors is used as the similarity measure for key points in the two images. The Euclidean distance is defined as in Equation 1:

d(F_a, F_b) = \sqrt{ \sum_{i=1}^{128} (f_a^i - f_b^i)^2 }    (Equation 1)

Take a key point in the left view and find the two key points in image 2 with the smallest Euclidean distances to it, then match according to the distance-ratio criterion: for these two candidates, divide the second-nearest distance d2 by the nearest distance d1 to obtain ratio; if ratio is greater than a proportional threshold ε, the matching point pair is accepted. Based on experimental results this threshold may be 1.5: taking 1.5 guarantees a sufficient number of matching point pairs while effectively controlling the amount of computation. The number of matching point pairs decreases as this threshold is raised; with a higher threshold the number of SIFT matching points falls, but the matches are more stable. The criterion is defined as in Equation 2:

ratio = d2 / d1
if ratio > ε: success; otherwise: failure    (Equation 2)
As the value of ε increases, the number of SIFT matching points falls, but the precision improves. Since the embodiments of the present invention demand high precision of the SIFT matching point pairs, and since the small difference between the left and right views would otherwise produce an excessive number of matching points and too much computation, the value should be raised appropriately; in this implementation ε is set to 1.8.
The coordinate information of the matching point pairs is recorded.
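Continuing the OpenCV sketch above, the distance-ratio test of Equation 2 and the recording of the matching point pairs might look as follows (the test is written in the d2/d1 form so that the 1.8 threshold is meaningful):

```python
import cv2

bf = cv2.BFMatcher(cv2.NORM_L2)
knn = bf.knnMatch(des_l, des_r, k=2)   # two nearest right-view descriptors per left key point

# accept a pair only when the second-nearest distance d2 exceeds 1.8x the nearest d1
matches = [m for m, n in knn if n.distance > 1.8 * m.distance]

# record the coordinate information of the matching point pairs
pts_l = [kp_l[m.queryIdx].pt for m in matches]
pts_r = [kp_r[m.trainIdx].pt for m in matches]
```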
Alternatively, the point in the right view with the smallest Euclidean distance may be used directly as the matching point of the current left-view SIFT key point, and the coordinate information of the matching point pair recorded.

Fourth step: estimate the pixel coordinate mapping model between the left and right views of the current scene with the RANSAC algorithm.
Here, the flow of estimating the pixel coordinate mapping model between the left and right views is shown in FIG. 4 and includes:

Step 201: randomly select 8 matching point pairs to initialize the pixel coordinate mapping model between the left and right views.

Step 202: find the support point set of the current model.

Step 203: judge whether the size of the support point set meets the threshold; if so, perform step 204; otherwise, return to step 201.

Here, based on experimental results, the threshold may be sixty-seven percent of the entire data point set; taking sixty-seven percent of the entire data point set as the threshold guarantees the validity of the model.

Step 204: estimate the pixel coordinate mapping model between the left and right views.
The flow of FIG. 4 takes the coordinate information of the matching point pairs in the left and right views as the input parameters of the RANSAC model estimation system to estimate the left-right view pixel mapping model.
In summary, the main points of the flow in FIG. 4 are:

(a) Randomly select a data point set from the set S of matching point pairs and initialize the model from this subset.

(b) Find the support point set Si of the current model according to the threshold Td; the set Si is the consensus set of the sample, and its members are defined as valid points.

(c) If the size of the set Si exceeds a threshold T, re-estimate the model with Si and terminate.

(d) If the size of the set Si is smaller than the threshold Ts, select a new sample and repeat the above steps.
After N attempts, the largest consensus set Si is selected and used to re-estimate the model, giving the final result. In this example the pixel coordinate information of the left and right views is P1 and P2 respectively. By the theory of epipolar geometry, the left and right views are related through the fundamental matrix F of the two images, satisfying P2^T F P1 = 0. The model estimation in this step is in fact the process of solving for F; since there are enough matching point pairs, a reliable coordinate mapping model can be generated.
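Continuing the sketch, OpenCV's RANSAC-based fundamental-matrix estimator can stand in for this model estimation (it requires at least 8 matching pairs; the 3.0-pixel tolerance and 0.99 confidence are illustrative values, not figures from the patent):

```python
import cv2
import numpy as np

p1 = np.float32(pts_l)
p2 = np.float32(pts_r)
F, mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 3.0, 0.99)  # p2^T F p1 = 0

inliers_l = p1[mask.ravel() == 1]  # the consensus set Si: correct matching pairs only
inliers_r = p2[mask.ravel() == 1]
```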
Fifth step: retain only the correct matching point pairs that conform to the pixel coordinate mapping model between the left and right views obtained in the fourth step, i.e., the point set in Si, and save them together with the obtained left-right view pixel coordinate mapping model as the reference information of the current frame.
Sixth step: loop the first to fifth steps indefinitely, retaining only the image reference information of the four most recently acquired frames in a queue. Each time a frame of stereoscopic image data is acquired, the information of the frame at the head of the queue is deleted and the reference information of the newest frame is stored at the tail.
Seventh step: take the photograph and smooth the left and right views, then solve for the average pixel coordinate mapping model from the first three frames of the current queue of left-right pixel coordinate mapping models. After the photograph is taken, the user chooses whether the views need to be repaired; if so, continue.
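A minimal sketch of the queue of the sixth step and the averaging of the seventh step, under the assumption that the per-frame model is a 3 × 3 matrix averaged element-wise (one plausible reading of "average pixel coordinate mapping model"):

```python
from collections import deque
import numpy as np

reference_queue = deque(maxlen=4)   # sixth step: the oldest frame info is dropped automatically

def on_preview_frame(model, inlier_pairs):
    reference_queue.append((model, inlier_pairs))

def average_model_at_capture():
    # seventh step: average over the first three frames in the queue
    models = [m for m, _ in list(reference_queue)[:3]]
    return np.mean(models, axis=0)  # element-wise mean of the model matrices
```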
Eighth step: verify whether there is occlusion. Divide each of the left and right views evenly into 8 blocks and take the average gray value of each block, then compare the average gray values of the corresponding blocks in the left and right views. If the relative difference between the average gray values of a pair of blocks exceeds 10%, the block with the lower gray value is regarded as a damaged block. If the number of damaged blocks is 0, skip to the tenth step.
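A sketch of this verification follows; the 8 blocks are taken as a 2 × 4 grid, which is an assumption, since the text only says each view is divided evenly into 8 blocks:

```python
import numpy as np

def damaged_blocks(left_gray, right_gray, rows=2, cols=4, rel_diff=0.10):
    h, w = left_gray.shape
    bad = []
    for r in range(rows):
        for c in range(cols):
            ys = slice(r * h // rows, (r + 1) * h // rows)
            xs = slice(c * w // cols, (c + 1) * w // cols)
            ml = float(left_gray[ys, xs].mean())
            mr = float(right_gray[ys, xs].mean())
            if abs(ml - mr) / max(ml, mr, 1e-6) > rel_diff:
                # the block with the lower mean grey value is treated as damaged
                bad.append((ys, xs, "left" if ml < mr else "right"))
    return bad
```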
Ninth step: repair the occluded regions.

Here, the damage repair flow is shown in FIG. 5 and includes:

Step 301: precisely detect the damaged region (Sobel operator).

Step 302: calculate the coordinate information of the damaged region in the normal view.

Step 303: repair the damaged region.

Step 304: repair the edges.
In summary, the main points of the flow in FIG. 5 are:

(a) Precisely determine the damaged region. Within a damaged block, detect abrupt gray-level edges with the Sobel operator. The operator consists of two 3 × 3 kernels, one horizontal and one vertical; convolving each of them with the image yields approximations of the horizontal and vertical brightness differences respectively.

(b) Use the left-right view coordinate mapping model of the current scene to determine the coordinate information of the damaged region in the other view.

(c) Replace the image content of the current damaged region with the image content of the corresponding region in the normal view.

(d) Edge repair. For the edges detected in (a), correct the gray value of each pixel in the 3 × 3 neighborhood of every edge pixel with the average of the gray values of the corresponding pixels in the left and right views.
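A simplified sketch of (a)-(d) for one damaged block is given below. It assumes rectified views whose corresponding regions share coordinates (the patent instead maps the region through the estimated coordinate mapping model), and it smooths the 3 × 3 neighbourhoods of the detected edge pixels as a stand-in for the left/right averaging of (d):

```python
import cv2
import numpy as np

def repair_block(damaged_view, normal_view, ys, xs):
    region = damaged_view[ys, xs]
    gx = cv2.Sobel(region, cv2.CV_64F, 1, 0, ksize=3)  # horizontal brightness differences
    gy = cv2.Sobel(region, cv2.CV_64F, 0, 1, ksize=3)  # vertical brightness differences
    mag = np.hypot(gx, gy)
    edges = mag > 2.0 * mag.mean()                     # abrupt grey-level transitions, step (a)

    damaged_view[ys, xs] = normal_view[ys, xs]         # step (c): copy the normal view's content

    patched = damaged_view[ys, xs]                     # step (d): smooth along the detected edges
    patched[edges] = cv2.blur(patched, (3, 3))[edges]
    return damaged_view
```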
Tenth step: noise point detection. During photographing, both the left and right views may carry a certain amount of salt-and-pepper noise, and after the ninth step a small amount of salt-and-pepper noise remains owing to the limitations of the coordinate mapping model. In this example a median-filter-style test is used to detect the noise in the left and right views and mark the noise points: within the N × N neighborhood of the current point (N odd), take the maximum, minimum and mean gray values; if the gray value of the current point is the maximum or minimum of this neighborhood and exceeds the set threshold (60%-150% of the neighborhood's average gray value is the basic range, and values outside this range exceed the set threshold), the point may be a noise point and is marked as suspicious. The coordinate mapping model is then used to determine the position region of the suspicious point in the other view, the current point is placed at that position, and the gray levels are compared again to determine whether the current point is indeed a noise point.
Eleventh step: repair the noise points. The gray value of every pixel in the 3 × 3 neighborhood of each noise point confirmed in the tenth step is corrected with the gray value of the corresponding pixel in the other viewpoint.
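A sketch of the tenth and eleventh steps with N = 3; the grey-level confirmation against the other view is reduced to a plain difference test with an assumed tolerance, and the views are again assumed to be rectified:

```python
import numpy as np

def detect_and_repair_noise(view, other, n=3, tol=20):
    r = n // 2
    out = view.copy()
    h, w = view.shape
    for y in range(r, h - r):
        for x in range(r, w - r):
            nb = view[y-r:y+r+1, x-r:x+r+1].astype(np.float64)
            v, mean = float(view[y, x]), nb.mean()
            extreme = v == nb.max() or v == nb.min()
            if extreme and not (0.6 * mean <= v <= 1.5 * mean):  # mark as suspicious
                if abs(v - float(other[y, x])) > tol:            # confirm against the other view
                    # eleventh step: repair the 3x3 neighbourhood from the other viewpoint
                    out[y-r:y+r+1, x-r:x+r+1] = other[y-r:y+r+1, x-r:x+r+1]
    return out
```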
Twelfth step: submit the optimized stereoscopic image data to the synthesis algorithm, and synthesize the stereoscopic image with the existing synthesis algorithm.
If the integrated modules described in the embodiments of the present invention are implemented in the form of software function modules and sold or used as independent products, they may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.

INDUSTRIAL APPLICABILITY
With the embodiments of the present invention, each frame of acquired stereoscopic image data is preprocessed to obtain a pixel coordinate mapping model between the left and right views, the model is refined into an average pixel coordinate mapping model, and image processing is finally performed on the basis of the average pixel coordinate mapping model, so that the acquired stereoscopic image data can be optimally reconstructed.

CLAIMS
1. A data processing method, the method comprising:

when preprocessing each frame of acquired stereoscopic image data, extracting feature values from the left view and the right view of the frame of stereoscopic image data respectively and matching them, and obtaining a pixel coordinate mapping model between the left and right views according to the matching result;

each frame of stereoscopic image data corresponding to one pixel coordinate mapping model between the left and right views, obtaining an average pixel coordinate mapping model according to the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data; and

performing image processing based on the average pixel coordinate mapping model.

2. The method according to claim 1, wherein extracting feature values from the left view and the right view of each frame of stereoscopic image data respectively and matching them specifically comprises:

extracting SIFT features from the left view and the right view respectively to obtain SIFT feature descriptors corresponding to the left view and the right view respectively; and

performing SIFT feature point matching between the SIFT feature descriptors corresponding to the left view and the right view respectively to obtain matching point pairs of the SIFT feature points of the left and right views.

3. The method according to claim 2, wherein obtaining the pixel coordinate mapping model between the left and right views according to the matching result specifically comprises:

obtaining the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.

4. The method according to claim 3, wherein obtaining the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters specifically comprises:

randomly selecting a data point set from a set S composed of the matching point pairs for initialization; filtering out, from the data point set according to a preset threshold, a conforming support point set Si as a consensus set; continually selecting new data samples and estimating the pixel coordinate mapping model between the left and right views according to the comparison between the size of Si and the preset threshold until the largest consensus set is obtained; and obtaining the required pixel coordinate mapping model between the left and right views for the final data sample according to the largest consensus set.

5. The method according to any one of claims 1 to 4, wherein obtaining the average pixel coordinate mapping model according to the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data specifically comprises:

obtaining the pixel coordinate mapping models between the left and right views corresponding to a specified number of frames of stereoscopic image data most recently acquired before stereoscopic image synthesis, and averaging the pixel coordinate mapping models between the left and right views corresponding to the multiple frames of stereoscopic image data to obtain the average pixel coordinate mapping model.

6. The method according to claim 5, wherein performing image processing based on the average pixel coordinate mapping model specifically comprises:

performing repair processing of damaged regions based on the average pixel coordinate mapping model; and

performing noise reduction processing based on the average pixel coordinate mapping model.

7. The method according to claim 6, wherein performing repair processing of damaged regions based on the average pixel coordinate mapping model specifically comprises:

upon detecting a damaged region, determining coordinate information of the damaged region in the other, normal view based on the average pixel coordinate mapping model, replacing the image content of the current damaged region with the image content of the corresponding region in the normal view, and correcting the edges of the detected damaged region with the average of the gray values of the corresponding pixels in the left and right views.

8. The method according to claim 6, wherein performing noise reduction processing based on the average pixel coordinate mapping model specifically comprises:

upon detecting a suspicious noise point, determining the position region of the suspicious noise point in the other view based on the average pixel coordinate mapping model, and performing gray-level comparison to determine whether the suspicious noise point is a noise point; and

correcting the gray value of each pixel in the neighborhood of a predetermined position region around the determined noise point with the gray value of the corresponding pixel in the other view.
9. A data processing apparatus, the apparatus comprising:

a preprocessing unit configured to, when preprocessing each frame of acquired stereoscopic image data, extract feature values from the left view and the right view of the frame of stereoscopic image data respectively and match them, and obtain a pixel coordinate mapping model between the left and right views according to the matching result; and

an image processing unit configured to perform image processing based on the average pixel coordinate mapping model, wherein each frame of stereoscopic image data corresponds to one pixel coordinate mapping model between the left and right views, and the average pixel coordinate mapping model is obtained according to the pixel coordinate mapping models between the left and right views corresponding to multiple frames of stereoscopic image data.

10. The apparatus according to claim 9, wherein the preprocessing unit further comprises a feature matching subunit;

the feature matching subunit is configured to extract SIFT features from the left view and the right view respectively to obtain SIFT feature descriptors corresponding to the left view and the right view respectively, and to perform SIFT feature point matching between the SIFT feature descriptors corresponding to the left view and the right view respectively to obtain matching point pairs of the SIFT feature points of the left and right views.

11. The apparatus according to claim 10, wherein the preprocessing unit further comprises a model estimation subunit;

the model estimation subunit is configured to obtain the pixel coordinate mapping model between the left and right views with the matching point pairs as input parameters.

12. The apparatus according to claim 11, wherein the model estimation subunit is configured to randomly select a data point set from the set S composed of the matching point pairs for initialization; filter out, from the data point set according to a preset threshold, a conforming support point set Si as a consensus set; continually select new data samples and estimate the pixel coordinate mapping model between the left and right views according to the comparison between the size of Si and the preset threshold until the largest consensus set is obtained; and obtain the required pixel coordinate mapping model between the left and right views for the final data sample according to the largest consensus set.

13. The apparatus according to any one of claims 9 to 12, wherein the image processing unit further comprises a model mean acquisition subunit;

the model mean acquisition subunit is configured to obtain the pixel coordinate mapping models between the left and right views corresponding to a specified number of frames of stereoscopic image data most recently acquired before stereoscopic image synthesis, and to average the pixel coordinate mapping models between the left and right views corresponding to the multiple frames of stereoscopic image data to obtain the average pixel coordinate mapping model.

14. The apparatus according to claim 13, wherein the image processing unit further comprises a first processing subunit and a second processing subunit;

the first processing subunit is configured to perform repair processing of damaged regions based on the average pixel coordinate mapping model; and

the second processing subunit is configured to perform noise reduction processing based on the average pixel coordinate mapping model.

15. The apparatus according to claim 14, wherein the first processing subunit is further configured to, upon detecting a damaged region, determine coordinate information of the damaged region in the other, normal view based on the average pixel coordinate mapping model, replace the image content of the current damaged region with the image content of the corresponding region in the normal view, and correct the edges of the detected damaged region with the average of the gray values of the corresponding pixels in the left and right views.

16. The apparatus according to claim 14, wherein the second processing subunit is further configured to, upon detecting a suspicious noise point, determine the position region of the suspicious noise point in the other view based on the average pixel coordinate mapping model, and perform gray-level comparison to determine whether the suspicious noise point is a noise point; and to correct the gray value of each pixel in the neighborhood of a predetermined position region around the determined noise point with the gray value of the corresponding pixel in the other view.

17. A computer storage medium, the computer storage medium comprising a set of instructions that, when executed, cause at least one processor to perform the data processing method according to any one of claims 1 to 8.

18. A user terminal, the user terminal comprising the data processing apparatus according to any one of claims 9 to 16.