WO2009089785A1 - Image processing method, encoding/decoding method and apparatus


Info

Publication number
WO2009089785A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
updated
view
pixel
pixel value
Prior art date
Application number
PCT/CN2009/070069
Other languages
English (en)
Chinese (zh)
Inventor
Yun He
Gang Zhu
Ping Yang
Xiaozhong Xu
Jianhua Zheng
Xiaozhen Zheng
Shujuan Shi
Original Assignee
Huawei Technologies Co., Ltd.
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. and Tsinghua University
Publication of WO2009089785A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • Image processing method, codec method and device. The present application claims priority to Chinese patent application No. 200810056088.3, filed on January 11, 2008 and titled "An image processing method, codec method and device", the entire contents of which are incorporated herein by reference.
  • the present invention relates to the field of digital media technologies, and in particular, to an image processing method, a codec method, and an apparatus.
  • Embodiments of the present invention provide an image processing method, a codec method, and a device, so that the performance of video encoding and decoding can be improved.
  • An image processing method includes:
  • At least one view image is updated based on the at least two view images, camera parameter information, and depth information of the object.
  • An image processing apparatus includes: an image obtaining unit, configured to acquire images of at least two views;
  • a parameter obtaining unit, configured to acquire the depth information of the object and the camera parameter information of each view;
  • and an updating unit, configured to perform update processing on at least one view image according to the camera parameter information and object depth information acquired by the parameter obtaining unit and the at least two view images acquired by the image obtaining unit.
  • An encoding method including:
  • the output image obtained by processing the reference image of the current view is used as a reference image for the encoding operation.
  • An encoding device comprising:
  • a reference image obtaining unit, configured to acquire a reference image for encoding an image of the current view;
  • an image processing device, configured to process the reference image acquired by the reference image obtaining unit to obtain an output image; and
  • a coding unit, configured to perform an encoding operation by using the output image provided by the image processing device as a reference image in the process of encoding the current view image.
  • a decoding method including:
  • the output image obtained by processing the reference image of the current view is used as a reference image for the decoding operation.
  • a decoding device comprising:
  • a reference image obtaining unit, configured to acquire a reference image for decoding an image of the current view;
  • an image processing device, configured to process the reference image acquired by the reference image obtaining unit to obtain an output image; and
  • a decoding unit, configured to perform a decoding operation by using the output image provided by the image processing device as a reference image in the process of decoding the current view image.
  • a method for implementing image upsampling comprising:
  • Pixels at integer pixel positions in the image are retained in the output image, and sub-pixel positions in the image are processed by the image processing method to obtain pixel values corresponding to the sub-pixel positions;
  • the pixel value corresponding to the obtained sub-pixel position is added to the output image to obtain an image after the upsampling process.
  • An apparatus for implementing image upsampling comprising:
  • an image obtaining unit, configured to acquire an image that needs to be subjected to upsampling processing;
  • an integer-pixel processing unit, configured to retain, in the output image, the pixels at integer pixel positions in the image acquired by the image obtaining unit;
  • an image processing device, configured to process the sub-pixel positions in the image acquired by the image obtaining unit to obtain the pixel values corresponding to the sub-pixel positions; and
  • a sampling result generating unit, configured to add the pixel values corresponding to the sub-pixel positions obtained by the image processing device to the output image obtained by the integer-pixel processing unit, to obtain the upsampled image.
  • FIG. 1 is a schematic diagram of an implementation principle of an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a camera moving process according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a decoding apparatus according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an apparatus for implementing image upsampling according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of upsampling in an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of down sampling in an embodiment of the present invention.
  • As illustrated in FIG. 1, embodiments of the present invention can specifically utilize the depth information of an object and the camera parameter
  • information of each view to perform an update process, such as upsampling or downsampling, on at least one view image, to obtain an updated image that meets a predetermined requirement; for example, the updated image can satisfactorily meet the needs of inter-view prediction, or
  • can meet the display needs of different spatial resolutions, or convert between images captured before and after camera movement. Referring to FIG. 1,
  • the corresponding input image may be regarded as the update reference image,
  • and the corresponding output image as the updated image;
  • before being updated, the output image is referred to as the image to be updated.
  • The input image may be one view image, or multiple images corresponding to multiple views.
  • The output image may be one view image (one-to-one or many-to-one), or multiple view images (many-to-many); that is, the image currently to be updated obtains information from the input view images in order to be updated. In the many-to-many or many-to-one case, the view to be updated may be any of the input views, or may be another view.
  • The image to be updated may also be an image whose content may change; specifically, the pixel values of some points or image blocks in the image may change, while the updated image size remains consistent with the image size before the update.
  • the complexity of multi-view coding is greatly reduced.
  • The size may be the original MxN, or it may be M/m x N/n after downsampling, where m and n are natural numbers.
  • the current multi-view coding method does not perform any processing on the reference picture and is directly used as a reference.
  • The depth information of the object refers to information capable of providing, or from which can be derived, the distance between an object in space and the camera; for example, an 8-bit depth map may be used to represent the depth information of the object, which may be obtained by
  • a quantization method that converts the distance between the object in space and the camera into an integer between 0 and 255, represented by an 8-bit binary number, with each view corresponding to a depth map; the corresponding depth information may be the original depth information, or the depth information reconstructed after coding.
  • the depth information of the object can be obtained from actual measurements or estimated by an algorithm.
  • The camera parameters include: extrinsic parameters, intrinsic parameters, and optical (sensor) plane parameters.
  • The extrinsic parameters include a rotation matrix R and a translation matrix T; the intrinsic parameters include the focal length, distortion parameters (such as radial distortion), and the optical translation; the optical plane parameters describe the imaging plane.
  • The camera model involves concepts such as the world coordinate system, the camera coordinate system, and the imaging plane.
  • The world coordinate system is a three-dimensional coordinate system whose origin is a previously defined point in three-dimensional space and whose X axis (or Y axis, or Z axis) is a previously defined direction;
  • the camera coordinate system is a three-dimensional coordinate system whose origin is the optical center of the camera (or video camera), usually with the optical axis as the Z axis;
  • the optical plane is the optical imaging plane of the camera (or video camera); it usually coincides with the XY plane of the camera coordinate system, but the origin of its coordinate system does not necessarily coincide with the origin of the camera coordinate system.
  • the rotation matrix R in the corresponding camera parameters reflects the rotation relationship between the world coordinate system and the camera coordinate system.
  • The matrix R contains three components Rx, Ry and Rz, where Rx is the rotation angle between the world coordinate system and the camera coordinate system about the X axis, Ry is the rotation angle about the Y axis, and Rz is the rotation angle about the Z axis.
  • The corresponding translation matrix T reflects the translation relationship between the world coordinate system and the camera coordinate system. It contains three components Tx, Ty and Tz, where Tx is the translation between the world coordinate system and the camera coordinate system along the X axis, Ty the translation along the Y axis, and Tz the translation along the Z axis. Since an ordinary camera (or video camera) does not image with exactly matching unit lengths per pixel along the X and Y axes of the imaging plane, to ensure calculation accuracy, a focal length fx along the imaging-plane X axis and a focal length fy along the imaging-plane Y axis are introduced.
  • The distortion parameter among the camera parameters can be expressed as s, and is usually present in cameras (or video cameras) with optical distortion.
  • The offset between the origin of the optical imaging plane and the origin of the XY plane of the camera coordinate system is denoted by px and py.
  • The camera's intrinsic and optical parameters can be represented together as the camera's intrinsic parameter matrix K.
  • $R = R_x R_y R_z$
  • $R_x$, $R_y$ and $R_z$ are the rotation matrices of the world coordinate system around the x, y, and z axes, respectively; their expressions can be, for example:
  • $R_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & -\sin\theta_x \\ 0 & \sin\theta_x & \cos\theta_x \end{pmatrix}$
  • Embodiments of the present invention can be implemented in accordance with the camera parameters described above.
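For concreteness, the following Python sketch assembles the intrinsic matrix K (from fx, fy, the skew/distortion term s, and the principal-point offsets px, py) and the rotation matrix R = Rx·Ry·Rz. It is a minimal sketch of the standard pinhole conventions described above; the function names and the axis-angle convention are illustrative, not taken from the patent.

```python
import numpy as np

def intrinsic_matrix(fx, fy, s, px, py):
    """Intrinsic matrix K from focal lengths, skew/distortion s, and principal-point offsets."""
    return np.array([[fx, s,  px],
                     [0., fy, py],
                     [0., 0., 1.]])

def rotation_matrix(theta_x, theta_y, theta_z):
    """R = Rx @ Ry @ Rz, rotations about the x, y, z axes (angles in radians)."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz
```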
  • The parallax vector between two views at the same time instant is caused by the difference between the two cameras' parameters, such as the camera positions and the intrinsic parameters of the cameras.
  • By the principles of stereo imaging and projection, the mapping position in other views of a given point of the current view can be obtained, and the pixel values at the corresponding positions in the other views can then be fetched to perform
  • the filtering (i.e., updating) process of the image and obtain the desired output image.
  • The image of the input view is used as the update reference image, and the image to be output is used as the image to be updated.
  • The updating process includes: calculating the three-dimensional coordinates of the object according to the camera parameter information, the depth information of the object, and the two-dimensional coordinates of the object in the output view image; calculating, using the camera parameter information and the three-dimensional coordinates, the two-dimensional coordinates of the object in the input view image, and determining the pixel value corresponding to those two-dimensional coordinates; and then updating, according to that pixel value, the pixel value at the corresponding two-dimensional coordinates of the object in the output view image, to obtain the processed output image.
  • The corresponding output image may be used for, but is not limited to, prediction between view images (i.e., images of different views) and prediction between different images within the same view.
  • the size of the output image before the update can also be expanded to match the size of the updated output image.
  • The method may include: calculating a scaling factor between the size of the updated output image and the size of the output image before the update, multiplying the existing pixel coordinates in the pre-update image by the scaling factor, and assigning the pixel value of each corresponding pixel in the pre-update output image to the location in the extended image corresponding to the newly calculated coordinates. If a calculated coordinate value is not an integer, the fractional part of the coordinate value may be discarded or rounded.
  • the output image size before the update is MxN
  • the updated output image is aMxbN
  • then the horizontal and vertical scaling factors between the updated and pre-update output images are a and b respectively. If the coordinates of a point on the pre-update output image are (u, v) with pixel value G, the pixel at coordinates (a·u, b·v) in the expanded image is assigned the value G; if a·u or b·v is not an integer, it may be rounded or its fractional part discarded directly.
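A minimal sketch of this size-expansion step under the linear-scaling rule just described; the function name and the rounding choice are illustrative assumptions, not from the patent.

```python
import numpy as np

def expand_image(img, a, b):
    """Scatter an MxN image into an (a*M)x(b*N) canvas by scaling coordinates.

    Pixel (u, v) of the pre-update image is copied to (round(a*u), round(b*v));
    unassigned positions stay zero and are filled later by the update process.
    """
    M, N = img.shape[:2]
    out = np.zeros((int(a * M), int(b * N)) + img.shape[2:], dtype=img.dtype)
    for u in range(M):
        for v in range(N):
            out[int(round(a * u)), int(round(b * v))] = img[u, v]
    return out
```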
  • the size of the output image before the update can be expanded to match the size of the other view images.
  • The method may include: calculating a scaling factor between the size of the other view image and the size of the pre-update output image, multiplying the existing pixel coordinates in the pre-update image by the scale factor, and assigning the pixel value of each
  • corresponding pixel in the pre-update output image to the location in the extended image corresponding to the newly calculated coordinates. If a calculated coordinate value is not an integer, the fractional part may be discarded or rounded.
  • For example, if the horizontal and vertical scaling factors between the other view image and the pre-update output image are a and b respectively, and the coordinates of a point on the pre-update output image are (u, v) with pixel value G, the pixel at coordinates (a·u, b·v) in the expanded image is assigned the value G; if a·u or b·v is not an integer, it may be rounded or its fractional part discarded directly.
  • the three-dimensional coordinates of the above calculated object reflect the actual distance between the points of the object in three-dimensional space or the approximate distance in three-dimensional space.
  • the above three-dimensional coordinates may also be the actual or approximate position of the point of the object in three-dimensional space.
  • the three-dimensional space referred to here can be expressed as the aforementioned world coordinate system, and can also be expressed as a mathematically used three-dimensional coordinate system.
  • the depth information of the above object may be the actual distance of each point of the object in the three-dimensional space from the camera; or may be an approximate value of the distance of each point of the object in the three-dimensional space.
  • the depth information of the object can be expressed as a depth map with a gray value of 0-255, which can be the same size as the view image.
  • The gray value of each point in the depth map reflects the distance from the camera of the corresponding point in three-dimensional space. For example, a point with gray value 255 in the depth map may indicate that the point in three-dimensional space corresponding to that point of the view image is closest to the camera,
  • while a point with gray value 0 may indicate that the corresponding point in three-dimensional space is farthest from the camera.
  • the above update processing process may be used to calculate the pixel values of all points or image blocks of the object in the output view image, or may also be used to calculate the pixel values of the partial points or partial image blocks of the object in the output view image.
  • The corresponding partial points or partial image blocks may be taken from the output view image at equal intervals, that is, points or image blocks whose row or column coordinates in the output view image fall at a fixed spacing are selected (e.g., every other point or image block), and the corresponding pixel values are calculated using the update process described above.
  • the two-dimensional coordinates of the image block may be two-dimensional coordinates corresponding to the upper left corner of the image block;
  • the three-dimensional coordinates of the image block may be the coordinate values corresponding to the upper left corner of the image block in the three-dimensional space coordinate system.
  • the camera parameter may be a parameter in a universal pinhole camera model, or may be a parameter in other common camera models.
  • the depth information of the object in the embodiment of the present invention may be represented as a gray scale image with a pixel value of 0 to 255, and each point in the gray scale corresponds to depth information of each point or each image block in the view image.
  • A pixel value of 0 (or 255) in the gray-scale image indicates that the corresponding point or image block is closest to the camera in three-dimensional space; a pixel value of 255 (or 0) indicates that the corresponding point or image block is farthest from the camera in three-dimensional space.
  • The distance of the corresponding point in three-dimensional space from the camera can be expressed as the Euclidean distance, or in another distance measure commonly used in mathematics.
  • The three-dimensional coordinates in the embodiment of the present invention correspond to the coordinate values of the object in a coordinate system determined before encoding; the corresponding three-dimensional coordinates reflect the approximate three-dimensional spatial positional relationship between most points in the image.
  • update processing procedure for the input image may specifically include the following steps:
  • Step 1 Read the camera parameters of each view.
  • Step 2: calculate the mapping matrix of each view by using the camera parameters of each view;
  • the mapping matrix is the mapping conversion coefficient between the two-dimensional coordinates of each point in a view image and the corresponding three-dimensional coordinates. That is, given the two-dimensional coordinates of a point in a view image (corresponding to a point of the object)
  • and one component of the three-dimensional coordinates of the corresponding object point in space, the other components of the three-dimensional coordinates can be calculated through the mapping matrix; or, given the three-dimensional coordinates of an object point,
  • the two-dimensional coordinate value of the corresponding point in the view image can be calculated;
  • Step 3: using the mapping matrix of the output view and the depth information of the object, the two-dimensional coordinates of points in the output image are converted into spatial three-dimensional coordinates;
  • Step 4: using the mapping matrix of the input view, convert the spatial three-dimensional coordinates of the above points into two-dimensional coordinates in the input image;
  • Step 5: assign the pixel value corresponding to those two-dimensional coordinates in the input image to the point at the corresponding coordinates of the output image, or use that pixel value to calculate the pixel value of the point at the corresponding coordinates of the output image, to obtain the desired output.
  • If the input image is a single view image, the pixel value of the corresponding point of that view is assigned to the pixel value of the corresponding coordinate point in the output image;
  • if the input image comprises a plurality of view images,
  • the pixel values of the corresponding points of the plurality of views may be weighted-averaged or otherwise mathematically combined, and the resulting value assigned to the pixel value of the corresponding coordinate point in the output image;
  • the mathematical operations described may be a combination of mathematical operations such as addition, subtraction, multiplication, division, shifting, and exponentiation.
  • The pixel value of a certain point in the output image may be obtained, for example, by weighted-averaging the pixel values of the surrounding points, or other calculation methods may be employed.
  • For the current pixel (u1, v1), its corresponding position (u2, v2) in the reference image is calculated.
  • u2 and v2 are not necessarily integers, so (u2, v2) needs to be rounded to the nearest integer pixel or sub-pixel by some rounding rule. If the position (u2, v2) falls outside the image boundary, u2 and v2 need to be adjusted (e.g., clipped). Except when (u2, v2) corresponds to an integer pixel inside the image, a calculation rule is required to obtain the reference pixel value at the position pointed to by (u2, v2).
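The following Python sketch strings Steps 1–5 together for the one-input-view case: each output pixel is back-projected into space using its depth, reprojected into the input view, and sampled with rounding and boundary clipping. It assumes the pinhole mapping matrix P = K[R | T] and nearest-pixel sampling; all function and variable names are illustrative, not from the patent.

```python
import numpy as np

def mapping_matrix(K, R, T):
    """3x4 mapping (projection) matrix P = K [R | T]."""
    return K @ np.hstack([R, T.reshape(3, 1)])

def backproject(P, u1, v1, z):
    """Given output-image coords (u1, v1) and the known z component, solve
    w*[u1, v1, 1]^T = P @ [x, y, z, 1]^T for x and y (Step 3)."""
    A = np.column_stack([P[:, 0], P[:, 1], -np.array([u1, v1, 1.0])])
    b = -(z * P[:, 2] + P[:, 3])
    x, y, _w = np.linalg.solve(A, b)
    return np.array([x, y, z])

def project(P, xyz):
    """3D point -> 2D image coords (u, v) in the view with matrix P (Step 4)."""
    u, v, w = P @ np.append(xyz, 1.0)
    return u / w, v / w

def update_output_view(P_out, P_in, z_out, img_in):
    """Steps 1-5 for one input view, with nearest-pixel sampling (Step 5)."""
    H, W = z_out.shape
    h, w = img_in.shape[:2]
    out = np.zeros((H, W) + img_in.shape[2:], dtype=img_in.dtype)
    for u1 in range(H):
        for v1 in range(W):
            xyz = backproject(P_out, u1, v1, z_out[u1, v1])
            u2, v2 = project(P_in, xyz)
            # Round to the nearest integer pixel and clip to the image boundary.
            u2 = min(max(int(round(u2)), 0), h - 1)
            v2 = min(max(int(round(v2)), 0), w - 1)
            out[u1, v1] = img_in[u2, v2]
    return out
```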
  • the above processing method saves a large amount of computational overhead by directly assigning pixel values of adjacent view images to the current point after simple mathematical operations, for example, saving computational overhead of transform, inverse transform, quantization, and inverse quantization modules.
  • When the current image is down-sampled or up-sampled, the number of coded bits required for the current view image is greatly reduced, thereby improving the coding efficiency.
  • the process of calculating the pixel value of the point or the image block in the output view image and obtaining the output image may specifically include any one of the following implementations:
  • The image of the input view is processed according to the depth information and the known camera parameters to obtain an image close to the output view image, which is applied to inter-view prediction and to the decoded display.
  • the corresponding processing at the encoding end includes:
  • Determining the depth information of the object during multi-view image encoding and calculating, according to the depth information and the known camera parameters, the corresponding position in the input view of each pixel of the output view, where: if the corresponding position falls on an integer pixel of the input view image, the pixel value of the corresponding input point is directly assigned to the corresponding pixel of the output view; if there are multiple input views, the pixel values of the corresponding input points are weighted-averaged or otherwise mathematically combined and then assigned to the corresponding pixel of the output view;
  • if the corresponding pixel is not found in the input view, or is not at an integer pixel position, the pixel value of the corresponding position is calculated from the pixel values of the surrounding pixels.
  • The output image of the filter used for the update process is used as the reference view for inter-view prediction; for example, if the inputs of the filter are view 1 and view 2 and the output is view 3, and the view currently being encoded is view 3, then the output image of the filter is used as the reference view of the current view for the subsequent inter-view prediction and encoding.
  • the corresponding processing at the decoding end includes:
  • The same filtering operation as at the encoding end is performed, taking views 1 and 2 as the input views and view 3 as the output view, and obtaining the output image according to the depth information of the object and the camera parameters of each view; using the output image, the decoding operation is performed on view 3, finally obtaining the reconstructed image of view 3.
  • the embodiment of the present invention can also be applied to the decoding end to process the corresponding input image to obtain a desired output image, and the corresponding processing includes:
  • Step 1: receive a code stream, parse it to acquire the depth information of the object and each camera parameter, and reconstruct each view image.
  • Step 2: take the reconstructed images of some of the obtained views, together with the depth information of the object and each camera parameter, as the input of the filter; the corresponding output view may be an existing view, or may be a virtual view that has not been decoded, where:
  • for each pixel position of the output view, the corresponding position in the input view is calculated: if the corresponding position falls on an integer pixel of the input view image, the pixel value of the corresponding input point is directly assigned to the corresponding pixel of the output view; if there are multiple input views, the pixel values of the corresponding inputs are weighted-averaged and then assigned to the
  • corresponding pixel of the output view; if the corresponding pixel is not found in the input view, or is not at an integer pixel position, the pixel value of the corresponding pixel is calculated from the pixel values of the surrounding pixels. If the output view is a virtual view, the corresponding output image can be used as a reference image; if the output view is an existing view, the decoding end has already reconstructed the image of that view. In this case, if the view image needs to be upsampled, the pixel values at the sub-pixel
  • positions of the view image need to be obtained, for example, the white point positions in FIG. 8.
  • The process of obtaining such a pixel value includes: calculating, according to the depth information of the object and the known camera parameters, the corresponding position in the input view of the point to be output; if the corresponding position falls on an integer pixel of the input view image, the pixel value of the corresponding input point is assigned to the corresponding pixel of the output view;
  • if there are multiple inputs, the pixel values of the corresponding points are weighted-averaged or otherwise mathematically combined and then assigned to the corresponding pixel of the output view; if no corresponding point can be found in the input, or it is not at an integer pixel position, the pixel value of the corresponding position is calculated from the pixel values of the surrounding pixels.
  • Step 3 Obtain an image of the output view for display of the multi-view image.
  • B is regarded as the current view to be encoded, and may be referred to as a current view or a code view, and A is regarded as a reference view of the B view.
  • the coding order can be: left to right, top to bottom.
  • A is regarded as the filter input view
  • B is regarded as the output view of the filter.
  • the corresponding implementation process may specifically include:
  • Step 31: obtain the camera parameters of each view, and use them to obtain the mapping matrices corresponding to the input view and the output view, so as to use the mapping matrices to convert between the spatial three-dimensional coordinates of the object and the corresponding two-dimensional coordinates under each view image;
  • the mapping matrix is $P = K [R \,|\, T]$, where $K$ is the intrinsic parameter matrix of the camera, $R$ is the rotation parameter matrix,
  • and $T$ is the translation parameter matrix;
  • Step 32: using the depth information of the object, obtain the z component of the three-dimensional coordinates of the object;
  • the depth information of an object is only a representation of its actual depth, not the actual depth itself, so it needs to be transformed to obtain the actual depth. For example, the range between the minimum and maximum object depths may be quantized into 256 levels represented by 8 bits, which are called the depth information of the object; the depth information therefore needs to be correspondingly inverse-processed to convert it into the actual depth of the object, that is, the z component of the three-dimensional coordinates.
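A one-line sketch of this inverse quantization, assuming the uniform 256-level quantization of the depth range described above (z_min and z_max denote the minimum and maximum object depths; the names are illustrative):

```python
def depth_value_to_z(d, z_min, z_max):
    """Invert 8-bit uniform depth quantization: map d in [0, 255] back to the actual depth z."""
    return z_min + (d / 255.0) * (z_max - z_min)
```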
  • Step 33: using the mapping matrix obtained in step 31 and the z component of the object's three-dimensional coordinates obtained in step 32, obtain the x and y components of the three-dimensional coordinates of the object;
  • the x and y components of the object's three-dimensional coordinates can be solved according to the three-dimensional projection principle, namely $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$ (with $w$ a projective scale factor),
  • where $P_{output}$ is the mapping matrix of the output view, (u1, v1) are the coordinates of the object on the imaging plane of the output view, and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 34: using the mapping matrix of the input view obtained in step 31 and the spatial three-dimensional coordinates of the object obtained in steps 32 and 33, obtain the corresponding two-dimensional coordinates of the object on the input view image;
  • namely $w \cdot (u, v, 1)^T = P_{input} \cdot (x, y, z, 1)^T$ (with $P_{input}$ the mapping matrix of the input view), where (u, v) are the two-dimensional coordinates of the object on the imaging plane of the input view; the corresponding (u, v) can be solved from this relation;
  • Step 35: determine, according to the pixel value corresponding to (u, v), the pixel value at coordinates (u1, v1) in the output view image, where specifically:
  • if u and v obtained in step 34 are integers, the pixel value at coordinates (u, v) in the input view image can be directly used as the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the result is assigned to the pixel at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4 pixels,
  • 6 pixels, or 8 pixels, etc. Step 36: steps 32 to 35 are repeated to obtain the pixel values of all pixels of the output view image; the obtained output view image is used as the reference image of the current coded view for the subsequent prediction and encoding operations.
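Where (u, v) is not an integer position, one concrete choice for the "weighted average of a plurality of nearby pixels" is bilinear interpolation over the 4 surrounding integer pixels. The patent leaves the exact rule open; the sketch below simply assumes that choice.

```python
import numpy as np

def sample_weighted(img, u, v):
    """Bilinear weighted average of the 4 integer pixels surrounding (u, v)."""
    h, w = img.shape[:2]
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, h - 1), min(v0 + 1, w - 1)
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * img[u0, v0] + (1 - du) * dv * img[u0, v1]
            + du * (1 - dv) * img[u1, v0] + du * dv * img[u1, v1])
```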
  • the corresponding decoding process specifically includes:
  • Step 37 Parsing the encoded code stream to obtain camera parameters and depth information of the object
  • Step 38: parsing the encoded code stream, obtaining the reconstructed image of the input view A, and treating the current view to be decoded as the output view; Step 39: using the camera parameters, obtain the mapping matrix of each view;
  • Step 310: using the depth information of the object, obtain the z component of the three-dimensional coordinates of the object;
  • Step 311: using the mapping matrix obtained in step 39 and the z component of the three-dimensional coordinates of the object obtained in step 310, obtain the x and y components of the three-dimensional coordinates of the object;
  • here $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$, where (u1, v1) are the coordinates of the object on the imaging plane of the output view and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 312: using the mapping matrix obtained in step 39 and the spatial three-dimensional coordinates of the object obtained in steps 310 and 311, obtain the corresponding coordinates (u, v) of the object on the input view image;
  • Step 313: determine, according to the pixel value at (u, v) in the input view image, the pixel value at coordinates (u1, v1) in the output view image, which may include:
  • if u and v obtained in step 312 are integers, the pixel value at coordinates (u, v) in the input view image is directly used as the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v obtained in step 312 are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the result is assigned to the pixel at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4, 6, or 8 pixels, etc. Steps 310 to 313 are repeated to obtain the pixel values of all pixels of the output view image; if the obtained output view image is used as the reference image for decoding, the subsequent decoding operation can be performed.
  • A is considered to be the input view of the filter
  • the B view image size is MxN
  • an image that is upsampled to a size of 2Mx2N is required as the output view of the filter.
  • the implementation process of the corresponding multi-view decoding display may specifically include:
  • Step 41: acquire each camera parameter and obtain the mapping matrix of each view using the camera parameters; further parse the code stream to obtain the reconstructed image of each view, where the reconstructed image of the A view is used as the input of the filter and the B view is regarded as the output of the filter;
  • Step 42: using the obtained depth information of the object, obtain the z component of the three-dimensional coordinates of the object in space; Step 43: using the mapping matrix obtained in step 41 and the z component obtained in step 42, obtain the x and y components of the three-dimensional coordinates of the object in space; Step 44: for the output view image, the pixels at integer pixel positions are retained; for a pixel at sub-pixel position (u1, v1), the corresponding coordinates (u, v) on the input view image are obtained using the mapping matrix obtained in step 41 and the three-dimensional coordinates of the object in space obtained in steps 42 and 43;
  • the integer pixel positions are the discrete pixel positions of the image before upsampling; for sub-pixel positions, the point midway between two integer pixel points is called a half-pixel point,
  • and the point midway between two half-pixel points is called a quarter-precision sub-pixel point;
  • Step 45: determine, according to the pixel value at (u, v) in the input view image, the pixel value at coordinates (u1, v1) in the output view image, where specifically:
  • if u and v obtained in step 44 are integers, the pixel value at coordinates (u, v) in the input view image is directly used as the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the result is assigned to the pixel at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4, 6, or 8 pixels, etc.;
  • Step 46: steps 44 and 45 are repeated until the pixel values of all sub-pixel points of the output view image are obtained, and the resulting output view image of size 2Mx2N is displayed.
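A sketch of this upsampling pass, assuming a factor-2 grid in which output positions with both coordinates even are the retained integer pixels and all other positions are sub-pixels filled by projection. It reuses the hypothetical backproject, project, and sample_weighted helpers sketched earlier; the nearest-depth lookup for sub-pixel positions is an additional assumption.

```python
import numpy as np

def upsample_view(P_out, P_in, img_out, z_out, img_in):
    """Upsample an MxN view to 2Mx2N: keep integer pixels, project sub-pixels."""
    M, N = img_out.shape[:2]
    up = np.zeros((2 * M, 2 * N) + img_out.shape[2:], dtype=img_out.dtype)
    for i in range(2 * M):
        for j in range(2 * N):
            if i % 2 == 0 and j % 2 == 0:
                up[i, j] = img_out[i // 2, j // 2]   # retained integer pixel
            else:
                u1, v1 = i / 2.0, j / 2.0            # sub-pixel position (u1, v1)
                z = z_out[int(u1), int(v1)]          # nearest available depth value
                u, v = project(P_in, backproject(P_out, u1, v1, z))
                up[i, j] = sample_weighted(img_in, u, v)
    return up
```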
  • the B image is an image taken when the camera position is moved to position 2, which is called a coded image.
  • the A image is an image taken when the camera is at position 1, and is called a reference image.
  • the B image is encoded in units of blocks.
  • The B image has R blocks; the encoding order is from left to right, from top to bottom.
  • the A image is a filter input image
  • the output image of the filter is an output image that is close to the B image.
  • the corresponding processing may specifically include:
  • Step 51 Obtain camera parameters of the camera at each position, and use the camera parameters to obtain a mapping matrix corresponding to each position;
  • the mapping matrix is $P = K [R \,|\, T]$, where $K$ is the camera intrinsic parameter matrix, $R$ is the rotation parameter matrix,
  • and $T$ is the translation parameter matrix;
  • Step 52: using the depth information of the object, obtain the z component of the three-dimensional coordinates of the object in space;
  • Step 53: using the mapping matrix obtained in step 51 and the z component of the three-dimensional coordinates of the object in space obtained in step 52, obtain the x and y components of the three-dimensional coordinates of the object in space;
  • here $P_{output}$ is the projection (mapping) matrix of position 2 found in step 51, and
  • $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$, where (u1, v1) are the coordinates of the object on the image plane of the B image and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 54: using the mapping matrix of position 1 obtained in step 51 and the three-dimensional coordinates of the object obtained in steps 52 and 53, obtain the corresponding two-dimensional coordinates of the object on the A image;
  • namely $w \cdot (u, v, 1)^T = P_{input} \cdot (x, y, z, 1)^T$ (with $P_{input}$ the mapping matrix of position 1), where (u, v) are the two-dimensional coordinates of the object on the imaging plane of the input view;
  • Step 55: assign the corresponding pixel value in the input view image pointed to by the coordinates (u, v) to the pixel at coordinates (u1, v1) in the output view image;
  • if u and v obtained in step 54 are integers, the pixel values at coordinates (u, v) in the input view image are directly assigned to the pixels at coordinates (u1, v1) in the output view image;
  • if u and v are not both integers, a weighted average calculation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the calculated weighted average is assigned to the corresponding pixel of the output view image.
  • Step 56: repeating steps 52 to 55 until the pixel values of all pixels of the output view image are obtained, the corresponding output view image is obtained; the output view image can be used as the reference image of the current coded view for the subsequent prediction and coding.
  • the corresponding decoding process may specifically include: Step 57, parsing the encoded code stream, and obtaining camera parameters corresponding to the camera at various positions and depth information of the object;
  • Step 58: parsing the encoded code stream to obtain the reconstructed image of the A image, and using the A image as the filter input image; the current decoded image is the B image, the filter output is an image close to the original B image, and the output image may be used as a reference image for decoding the current image;
  • Step 59 Determine, by using each camera parameter obtained in step 57, a mapping matrix corresponding to each position of the camera.
  • Step 510 Determine, by using depth information of the object, a z component of the three-dimensional coordinate of the object in space;
  • Step 511: using the mapping matrix obtained in step 59
  • and the z component of the three-dimensional coordinates of the object obtained in step 510, obtain the x and y components of the three-dimensional coordinates of the object in space;
  • here $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$, where (u1, v1) are the coordinates of the object on the B image plane and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 512: using the mapping matrix obtained in step 59 and the spatial three-dimensional coordinates of the object obtained in steps 510 and 511, obtain the corresponding coordinates (u, v) of the object on the input view image;
  • Step 513: calculate, according to the pixel value at (u, v) in the input view image, the pixel value at coordinates (u1, v1) in the output view image, which may specifically be:
  • if u and v obtained in step 512 are integers, the pixel value at coordinates (u, v) in the input view image is directly used as the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v obtained in step 512 are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the result is assigned to the pixel at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4, 6, or 8 pixels, etc. Step 514: steps 510 to 513 are repeated to obtain the pixel values of all pixels of the output view image, thereby obtaining the corresponding output view image, which may be used as the reference image of the current view image to be decoded for the subsequent decoding operation, improving decoding efficiency and performance.
  • A is considered to be the input view of the filter
  • the B view image size is MxN
  • an image that is upsampled to a size of 2Mx2N is required as the output view of the filter.
  • the implementation process of the corresponding multi-view decoding display may specifically include:
  • Step 61: acquire each camera parameter and obtain the mapping matrix of each view using the camera parameters; further parse the code stream to obtain the reconstructed image of each view, where the reconstructed image of the A view is used as the input of the filter and the B view is regarded as the output of the filter;
  • Step 62: using the obtained depth information of the object, obtain the z component of the three-dimensional coordinates of the object in space;
  • Step 63: using the mapping matrix obtained in step 61 and the z component of the three-dimensional coordinates of the object obtained in step 62, obtain the x and y components of the three-dimensional coordinates of the object in space;
  • Step 64: for the output view image, the pixels at integer pixel positions are retained; for a pixel at sub-pixel position (u1, v1), the corresponding coordinates (u, v) on the input view image are obtained using the mapping matrix obtained in step 61 and the three-dimensional coordinates of the object in space obtained in steps 62 and 63;
  • the integer pixel positions are the discrete pixel positions of the image before upsampling; for sub-pixel positions, the point midway between two integer pixel points is called a half-pixel point,
  • and the point midway between two half-pixel points is called a quarter-precision sub-pixel point;
  • Step 65: calculate, according to the pixel value at (u, v) in the input view image, the pixel value at coordinates (u1, v1) in the output view image, where specifically:
  • if u and v obtained in step 64 are integers, the pixel value at coordinates (u, v) in the input view image is directly used to calculate the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the calculated weighted average or mathematical operation result is used;
  • the predetermined requirement may be, for example, closest distance or distance less than a predetermined value,
  • and the plurality of pixels may be 4, 6, or 8 pixels, etc.; the above calculation result is recorded as value 1;
  • Step 66: obtain the pixel values of the points adjacent to coordinates (u1, v1) in the output image, perform weighted averaging or a mathematical operation on the obtained pixel values, and record the result as value 2;
  • Step 67: perform weighted averaging or a mathematical calculation on value 1 and value 2, and use the result as the pixel value at coordinates (u1, v1) in the output image;
  • steps 64, 65, 66 and 67 are repeated until the pixel values of all sub-pixel points of the output view image are obtained, and the resulting output view image of size 2Mx2N is displayed.
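A sketch of the value 1 / value 2 combination described in steps 65–67, which blends the projected value (value 1) with an average of the already-available neighboring output pixels (value 2). The equal 50/50 weighting is purely illustrative, since the text only requires some weighted average or mathematical operation.

```python
import numpy as np

def blend_subpixel(value1, neighbor_values, w1=0.5):
    """Combine the projected value (value 1) with the neighbor average (value 2)."""
    value2 = np.mean(neighbor_values, axis=0)  # value 2: average of adjacent output pixels
    return w1 * value1 + (1.0 - w1) * value2
```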
  • B is regarded as the current view to be encoded, and may be referred to as a current view or a code view, and A is regarded as a reference view of the B view.
  • the coding order can be: left to right, top to bottom.
  • A is regarded as the filter input view
  • B is regarded as the output view of the filter.
  • the corresponding implementation process may specifically include:
  • Step 71: obtain the camera parameters of each view, and use them to obtain the mapping matrices corresponding to the input view and the output view, for converting between the spatial three-dimensional coordinates of the object and the corresponding two-dimensional coordinates under the respective view images;
  • the mapping matrix is $P = K [R \,|\, T]$, where $K$ is the intrinsic parameter matrix of the camera, $R$ is the rotation parameter matrix,
  • and $T$ is the translation parameter matrix;
  • Step 72: using the depth information of the object, obtain the z component of the three-dimensional coordinates of the object;
  • the depth information of an object is only a representation of its actual depth, not the actual depth itself, so it needs to be transformed to obtain the actual depth. For example, the range between the minimum and maximum object depths may be quantized into 256 levels represented by 8 bits, which are called the depth information of the object; the depth information therefore needs to be correspondingly inverse-processed to convert it into the actual depth of the object, that is, the z component of the three-dimensional coordinates.
  • Step 73: using the mapping matrix obtained in step 71 and the z component of the object's three-dimensional coordinates obtained in step 72, obtain the x and y components of the three-dimensional coordinates of the object;
  • the x and y components of the object's three-dimensional coordinates can be solved according to the three-dimensional projection principle, namely $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$,
  • where (u1, v1) are the coordinates of the object on the imaging plane of the output view
  • and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 74: using the mapping matrix of the input view obtained in step 71 and the spatial three-dimensional coordinates of the object obtained in steps 72 and 73, obtain the corresponding two-dimensional coordinates of the object on the input view image;
  • namely $w \cdot (u, v, 1)^T = P_{input} \cdot (x, y, z, 1)^T$, where (u, v) are the two-dimensional coordinates of the object on the imaging plane of the input view; the corresponding (u, v) can be solved from this relation;
  • Step 75: calculate, according to the pixel value corresponding to (u, v), the pixel value at coordinates (u1, v1) in the output view image, where specifically:
  • if u and v obtained in step 74 are integers, the pixel value at coordinates (u, v) in the input view image can be directly used to calculate the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v obtained in step 74 are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the result is used to calculate the pixel value at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4, 6, or 8 pixels, etc.; the above calculation result is recorded as value 1;
  • Step 76: obtain the pixel values of the points adjacent to coordinates (u1, v1) in the output image, perform weighted averaging or a mathematical operation on the obtained pixel values, and record the result as value 2;
  • Step 77: perform a weighted average or mathematical calculation on value 1 and value 2, and use the result as the pixel value at coordinates (u1, v1) in the output image;
  • Step 78: repeating steps 72 to 77, the pixel values of all pixels of the output view image are obtained; the obtained output view image is used as the reference image of the current coded view, and the subsequent prediction and encoding operations are performed.
  • the corresponding decoding process specifically includes: Step 79: Parsing the encoded code stream to obtain camera parameters and depth information of the object;
  • Step 710 parsing the encoded code stream, obtaining a reconstructed image of the input view A, and treating the current view as an output view;
  • Step 711 using the camera parameters to obtain a mapping matrix of each view;
  • Step 712 Using the depth information of the object, obtaining a z component of the three-dimensional coordinate of the object;
  • Step 713: using the mapping matrix obtained in step 711 and the z component of the three-dimensional coordinates of the object obtained in step 712, obtain the x and y components of the object's three-dimensional coordinates;
  • here $w \cdot (u_1, v_1, 1)^T = P_{output} \cdot (x, y, z, 1)^T$, where (u1, v1) are the coordinates of the object on the imaging plane of the output view and (x, y, z) are the three-dimensional coordinates of the object in space.
  • Step 714: using the mapping matrix obtained in step 711 and the spatial three-dimensional coordinates of the object obtained in steps 712 and 713, obtain the corresponding coordinates (u, v) of the object on the input view image;
  • Step 715: determine, according to the pixel value at (u, v) in the input view image, the pixel value at coordinates (u1, v1) in the output view image, which may include:
  • if u and v obtained in step 714 are integers, the pixel value at coordinates (u, v) in the input view image is directly used to calculate the pixel value at coordinates (u1, v1) in the output view image;
  • if u and v obtained in step 714 are not both integers, a weighted average calculation or mathematical operation is performed on a plurality of pixels meeting a predetermined requirement around the position (u, v), and the resulting weighted average or mathematical
  • operation result is used to calculate the pixel at coordinates (u1, v1) in the output view image; the predetermined requirement may be, for example, closest distance or distance less than a predetermined value, and the plurality of pixels may be 4, 6, or 8 pixels, etc.; the above calculation result is recorded as value 1;
  • Step 716: obtain the pixel values of the points adjacent to coordinates (u1, v1) in the output image, perform weighted averaging or a mathematical operation on the obtained pixel values, and record the result as value 2;
  • Step 717: perform weighted averaging or a mathematical calculation on value 1 and value 2, and use the result as the pixel value at coordinates (u1, v1) in the output image.
  • An embodiment of the present invention further provides an image processing apparatus.
  • the specific implementation structure is as shown in FIG. 5, and may include the following elements:
  • An image obtaining unit 501 configured to acquire at least two views of the image
  • a parameter obtaining unit 502, configured to acquire the depth information of the object and the camera parameter information of each view; and an update processing unit 503, configured to perform filtering (i.e., updating) processing on at least one view image according to the camera parameter information and object depth information acquired by the parameter obtaining unit 502
  • and the at least two view images acquired by the image obtaining unit 501, to obtain an updated processed image.
  • the update processing unit 503 may specifically include the following units:
  • a mapping matrix calculation unit 5031, configured to calculate, according to the camera parameter information of each view acquired by the parameter obtaining unit 502, the mapping matrix corresponding to each view, where the mapping matrix describes the conversion between the two-dimensional coordinates of each point or image block in each view image and the corresponding three-dimensional coordinates;
  • a pixel value calculation unit 5032, configured to calculate the three-dimensional coordinates of the object according to the mapping matrix determined by the mapping matrix calculation unit 5031 (for the view corresponding to the image to be updated), the depth information of the object acquired by the parameter obtaining unit 502, and the two-dimensional coordinates of the object in the output view image (i.e., the image to be updated) acquired by the image obtaining unit 501; and further, using the mapping matrix of the input view (the view corresponding to the update reference image)
  • and the three-dimensional coordinates, to calculate the two-dimensional coordinates in the input view image (i.e., the update reference view image) acquired by the image obtaining unit 501, and determine the pixel value corresponding to those two-dimensional coordinates;
  • an updated image generating unit 5033, configured to assign the pixel value corresponding to the two-dimensional coordinates determined by the pixel value calculating unit 5032 to the two-dimensional coordinates of the object in the output view, to obtain the output image.
  • The updated image generating unit 5033 may specifically include the following two units:
  • a first assigning unit 50331, configured to directly use the corresponding pixel value as the pixel value at the two-dimensional coordinates of the object in the output view when the two-dimensional coordinates are both integers;
  • a second assigning unit 50332, configured to use, when the two-dimensional coordinates are not both integers, a weighted average of a predetermined number of pixels meeting a predetermined requirement relative to the two-dimensional coordinates as the pixel value at the two-dimensional coordinates of the object in the output view.
  • Alternatively, the updated image generating unit 5033 may include any one of the following units:
  • a first generating unit, configured to assign the pixel values corresponding to the two-dimensional coordinates in the input view image determined by the pixel value calculating unit 5032 to the corresponding coordinate points or image blocks in the output view image, to obtain the output image;
  • a second generating unit, configured to perform a mathematical operation on the pixel values corresponding to the two-dimensional coordinates in the at least two input view images determined by the pixel value calculating unit 5032, and assign the result to the corresponding coordinate point or image block in the output view image, to obtain the output image;
  • a third generating unit, configured to perform a mathematical operation on the pixel values corresponding to the two-dimensional coordinates in the at least two input view images determined by the pixel value calculating unit 5032 to obtain a first value; perform a mathematical operation on the pixel values of the available points or image blocks adjacent to the corresponding coordinate point or image block in the output view image to obtain a second value; and perform a mathematical operation on the first value and the second value, assigning the result to the corresponding coordinate point or image block in the output view image, to obtain the output image.
  • The mapping matrix calculating unit 5031 is optional; other units may be used in its place to implement the conversion between the two-dimensional coordinates and the three-dimensional coordinates by other calculation methods.
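
By way of illustration only, the following Python sketch shows how units 5031 and 5032 might realize the two-dimensional/three-dimensional conversion under a pinhole camera model. The intrinsic matrices K_out/K_in and poses (R, t) stand in for the patent's "camera parameter information"; this parameterization, and every name below, is an assumption rather than the patent's own definition.

    import numpy as np

    def warp_coordinate(u1, v1, depth, K_out, R_out, t_out, K_in, R_in, t_in):
        # First half of unit 5032: back-project the output-view pixel
        # (u1, v1) to a 3-D point using its depth and the output view's
        # (assumed pinhole) parameters.
        pix = np.array([u1, v1, 1.0])
        ray = np.linalg.inv(K_out) @ pix       # camera-space ray, z == 1
        X_cam = depth * ray                    # camera-space 3-D point
        X_world = R_out.T @ (X_cam - t_out)    # world coordinates
        # Second half of unit 5032: project the 3-D point into the input
        # view (the reference image used for updating).
        x = K_in @ (R_in @ X_world + t_in)
        return x[0] / x[2], x[1] / x[2]        # possibly fractional (u, v)

The returned (u, v) is then handled as in steps 714 to 717 above: integer coordinates are copied directly, fractional ones interpolated.
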
  • The device provided by the embodiment of the present invention may further include a size expansion operation unit, configured to expand the size of the image to be updated to be consistent with the size of the updated image when the two sizes are inconsistent. The expansion manner may include: calculating a scaling factor between the updated image size and the size of the image to be updated, multiplying the existing pixel coordinates in the image to be updated by the scaling factor, and assigning the pixel value of the corresponding pixel in the image to be updated to the position of the new coordinates thus calculated in the expanded image; if a calculated coordinate value is not an integer, the fractional part of the coordinate value is removed or rounded according to a rounding rule. Alternatively, when the size of the image to be updated is inconsistent with the sizes of other view images, the size of the image to be updated is expanded to be consistent with the other view image sizes; the expansion manner may include: calculating a scaling factor between the other view image size and the size of the image to be updated, and multiplying the existing pixel coordinates in the image to be updated by that scaling factor in the same way. A coordinate-scaling sketch follows.
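
A minimal sketch of the size expansion just described, in Python; the function name, the explicit per-pixel loop, and the truncate/round switch are illustrative choices under the stated assumptions, not mandated by the text.

    import numpy as np

    def expand_image(src, target_h, target_w, rounding="truncate"):
        # Scaling factors between the target size (updated image or other
        # view image) and the current size of the image to be updated.
        h, w = src.shape[:2]
        sy, sx = target_h / h, target_w / w
        dst = np.zeros((target_h, target_w) + src.shape[2:], dtype=src.dtype)
        for y in range(h):
            for x in range(w):
                ny, nx = y * sy, x * sx        # scaled coordinates
                if rounding == "truncate":
                    iy, ix = int(ny), int(nx)  # drop the fractional part
                else:
                    iy, ix = int(round(ny)), int(round(nx))
                iy, ix = min(iy, target_h - 1), min(ix, target_w - 1)
                dst[iy, ix] = src[y, x]        # assign the source pixel value
        return dst
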
  • The input view and the output view may be views of different cameras, or they may be views of the same camera at different positions.
  • the above image processing apparatus can be applied to a corresponding encoding apparatus or decoding apparatus, wherein:
  • the corresponding encoding device may include:
  • a reference image acquiring unit 601, configured to acquire a reference image for encoding the image of the current view;
  • the image processing apparatus described above, configured to process the reference image acquired by the reference image acquiring unit 601 to obtain an output image;
  • an encoding unit 602, configured to perform the encoding operation using the output image provided by the image processing apparatus as a reference image when encoding the current view image.
  • the corresponding decoding device may include:
  • a reference image acquiring unit 701, configured to acquire a reference image for decoding the image of the current view;
  • the image processing apparatus described above, configured to process the reference image acquired by the reference image acquiring unit 701 to obtain an output image;
  • a decoding unit 702, configured to perform the decoding operation using the output image provided by the image processing apparatus as a reference image when decoding the current view image. A compact sketch of this encode/decode flow follows.
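
Purely to illustrate the data flow through units 601/602 and, symmetrically, 701/702, here is a hypothetical Python sketch; image_processor, encode, decode and all parameters are stand-ins, since the patent defines no such API.

    def encode_current_view(current_image, reference_image, image_processor, encode):
        # Unit 601 supplies reference_image; the image processing apparatus
        # updates it toward the current view; unit 602 encodes against it.
        processed_reference = image_processor(reference_image)
        return encode(current_image, reference=processed_reference)

    def decode_current_view(bitstream, reference_image, image_processor, decode):
        # Mirror flow for units 701/702: the decoder applies the same update
        # processing, so encoder and decoder reference images match.
        processed_reference = image_processor(reference_image)
        return decode(bitstream, reference=processed_reference)

Note that both devices embed the same image processing apparatus, which is what lets the two sides derive matching processed reference images.
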
  • The image processing apparatus provided by the embodiment of the present invention may also be used in an apparatus for implementing image upsampling; the specific implementation structure is as shown in FIG. and may include:
  • an image acquiring unit 801, configured to acquire an image that needs upsampling processing;
  • an integer pixel processing unit 802, configured to retain, in the output image, the pixels at integer pixel positions of the image acquired by the image acquiring unit 801;
  • the image processing apparatus described above, configured to process the sub-pixel positions of the image acquired by the image acquiring unit 801 to obtain the pixel values corresponding to those sub-pixel positions;
  • a sampling result generating unit 803, configured to add the pixel values corresponding to the sub-pixel positions obtained by the image processing apparatus into the output image produced by the integer pixel processing unit 802, to obtain the upsampled image. A minimal upsampling sketch follows.
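
The following Python sketch illustrates one way units 801 to 803 could combine for 2x upsampling; the plain neighbour averaging used for the sub-pixel positions is an assumed stand-in for the "weighted average or mathematical operation" left open by the text.

    import numpy as np

    def upsample_2x(img):
        h, w = img.shape[:2]
        out = np.zeros((2 * h, 2 * w) + img.shape[2:], dtype=np.float64)
        # Unit 802: retain the integer-position pixels in the output image.
        out[0::2, 0::2] = img
        # Unit 803 (via the image processing apparatus): fill the sub-pixel
        # positions, here by averaging the nearest retained neighbours.
        out[0::2, 1:-1:2] = (out[0::2, 0:-2:2] + out[0::2, 2::2]) / 2  # horizontal
        out[0::2, -1] = out[0::2, -2]                                  # right edge
        out[1:-1:2, :] = (out[0:-2:2, :] + out[2::2, :]) / 2           # vertical
        out[-1, :] = out[-2, :]                                        # bottom edge
        return out.astype(img.dtype)
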
  • The embodiments of the present invention process an input view image (such as a reference image, or an image that needs upsampling) according to the camera parameters of the acquired images and the depth information of the object. This can reduce the number of bits spent encoding non-critical or unimportant images, thereby improving multi-view codec performance and simplifying multi-view codec processing; at the same time, since the number of coded bits is reduced, the computational resources required for encoding and decoding can also be reduced.
  • The embodiments of the present invention can also be applied to the case of camera movement, improving prediction and coding/decoding efficiency in that case.
  • The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an image processing method, an encoding/decoding method and an apparatus. The methods mainly consist in obtaining the images of at least two views, camera parameter information and depth information of an object, and performing update processing on the image of at least one view according to the images of the at least two views, the camera parameter information and the depth information of the object. The embodiment of the present invention can be used to perform update processing on the image of the view to be updated according to the camera parameters of the obtained images and the depth information of the object, so as to obtain the images predicted during encoding or decoding, thereby reducing the number of coding bits of the image to be updated and reducing the computational load of the encoding and decoding to be performed.
PCT/CN2009/070069 2008-01-11 2009-01-07 Procédé de traitement d'image, procédé de codage/décodage et appareil WO2009089785A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810056088.3 2008-01-11
CN 200810056088 CN101483765B (zh) 2008-01-11 2008-01-11 一种图像处理方法、编解码方法及装置

Publications (1)

Publication Number Publication Date
WO2009089785A1 (fr)

Family

ID=40880674

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/070069 WO2009089785A1 (fr) 2008-01-11 2009-01-07 Procédé de traitement d'image, procédé de codage/décodage et appareil

Country Status (2)

Country Link
CN (1) CN101483765B (fr)
WO (1) WO2009089785A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103716641B (zh) * 2012-09-29 2018-11-09 浙江大学 预测图像生成方法和装置
CN103456007B (zh) * 2013-08-09 2016-12-28 华为终端有限公司 一种获取深度信息的方法和装置
CN103428499B (zh) * 2013-08-23 2016-08-17 清华大学深圳研究生院 编码单元的划分方法及使用该方法的多视点视频编码方法
CN104994360B (zh) * 2015-08-03 2018-10-26 北京旷视科技有限公司 视频监控方法和视频监控系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1281569A (zh) * 1997-12-05 2001-01-24 动力数字深度研究有限公司 改进的图像转换和编码技术
US20070109409A1 (en) * 2004-12-17 2007-05-17 Sehoon Yea Method and System for Processing Multiview Videos for View Synthesis using Skip and Direct Modes
CN101056398A (zh) * 2006-03-29 2007-10-17 清华大学 一种多视编码过程中获取视差矢量的方法及编解码方法

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009131688A2 (fr) * 2008-04-25 2009-10-29 Thomson Licensing Modes de saut intervues avec profondeur
WO2009131688A3 (fr) * 2008-04-25 2010-01-28 Thomson Licensing Modes de saut intervues avec profondeur
US8532410B2 (en) 2008-04-25 2013-09-10 Thomson Licensing Multi-view video coding with disparity estimation based on depth information
KR20160054026A (ko) * 2008-04-25 2016-05-13 톰슨 라이센싱 깊이 정보에 기초한 디스패리티 예측을 구비한 다중 시점 비디오 코딩
KR101617842B1 (ko) 2008-04-25 2016-05-18 톰슨 라이센싱 깊이 정보에 기초한 디스패리티 예측을 구비한 다중 시점 비디오 코딩
KR101727311B1 (ko) 2008-04-25 2017-04-14 톰슨 라이센싱 깊이 정보에 기초한 디스패리티 예측을 구비한 다중 시점 비디오 코딩
CN104780383A (zh) * 2015-02-02 2015-07-15 杭州电子科技大学 一种3d-hevc多分辨率视频编码方法
CN104780383B (zh) * 2015-02-02 2017-09-19 杭州电子科技大学 一种3d‑hevc多分辨率视频编码方法
CN111739097A (zh) * 2020-06-30 2020-10-02 上海商汤智能科技有限公司 测距方法及装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN101483765B (zh) 2011-09-21
CN101483765A (zh) 2009-07-15

Similar Documents

Publication Publication Date Title
JP6501240B2 (ja) 点群を圧縮する方法
WO2020001168A1 (fr) Procédé, appareil et dispositif de reconstruction tridimensionnelle, et support d'informations
US20200051269A1 (en) Hybrid depth sensing pipeline
WO2009089785A1 (fr) Procédé de traitement d'image, procédé de codage/décodage et appareil
CN105491364B (zh) 编码装置及其控制方法
CN107888928B (zh) 运动补偿预测方法和设备
KR102254986B1 (ko) 구면 투영부들에 의한 왜곡을 보상하기 위한 등장방형 객체 데이터의 프로세싱
CN105681805A (zh) 视频编码、解码方法及其帧间预测方法和装置
KR102141319B1 (ko) 다시점 360도 영상의 초해상화 방법 및 영상처리장치
US11711535B2 (en) Video-based point cloud compression model to world signaling information
WO2017124298A1 (fr) Procédé de codage et de décodage vidéo, et procédé de prédiction intertrame, appareil et système associés
WO2018113339A1 (fr) Procédé et dispositif de construction d'image de projection
JP6232075B2 (ja) 映像符号化装置及び方法、映像復号装置及び方法、及び、それらのプログラム
CN113793255A (zh) 用于图像处理的方法、装置、设备、存储介质和程序产品
TWI684359B (zh) 用於沉浸式視頻編解碼的信令語法的方法及裝置
JP7171169B2 (ja) ライトフィールド・コンテンツを表す信号を符号化する方法および装置
CN115330935A (zh) 一种基于深度学习的三维重建方法及系统
US10257488B2 (en) View synthesis using low resolution depth maps
US11206427B2 (en) System architecture and method of processing data therein
JP4937161B2 (ja) 距離情報符号化方法,復号方法,符号化装置,復号装置,符号化プログラム,復号プログラムおよびコンピュータ読み取り可能な記録媒体
WO2019185983A1 (fr) Procédé, appareil et produit-programme d'ordinateur destinés au codage et au décodage de vidéo volumétrique numérique
WO2021053735A1 (fr) Dispositif de conversion ascendante, procédé de conversion ascendante et programme de conversion ascendante
CN114612621A (zh) 基于三维倾斜模型的全景图生成方法及系统
WO2024020211A1 (fr) Procédés, systèmes et appareils de prédiction intra
CN117649435A (zh) 一种单目深度估计方法、系统、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09702465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09702465

Country of ref document: EP

Kind code of ref document: A1