WO2015024362A1 - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
WO2015024362A1
WO2015024362A1 · PCT/CN2014/070138
Authority
WO
WIPO (PCT)
Prior art keywords
image
video image
encoding
foreground
depth
Prior art date
Application number
PCT/CN2014/070138
Other languages
English (en)
French (fr)
Inventor
郭岩岭 (Guo Yanling)
王田 (Wang Tian)
张德军 (Zhang Dejun)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP14838494.4A priority Critical patent/EP2999221A4/en
Priority to JP2016526410A priority patent/JP6283108B2/ja
Publication of WO2015024362A1 publication Critical patent/WO2015024362A1/zh
Priority to US14/972,222 priority patent/US9392218B2/en

Classifications

    • H04N 19/597: Predictive coding specially adapted for multi-view video sequence encoding
    • H04N 7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • G06T 7/11: Region-based segmentation
    • G06T 7/194: Segmentation involving foreground-background segmentation
    • G06T 7/50: Depth or shape recovery
    • H04N 19/115: Selection of the code volume for a coding unit prior to coding
    • H04N 19/167: Position within a video image, e.g. region of interest [ROI]
    • H04N 19/17: Adaptive coding in which the coding unit is an image region, e.g. an object
    • H04N 19/172: Adaptive coding in which the coding unit is a picture, frame or field
    • H04N 19/182: Adaptive coding in which the coding unit is a pixel
    • H04N 19/186: Adaptive coding in which the coding unit is a colour or a chrominance component
    • H04N 19/29: Video object coding involving scalability at the object level, e.g. video object layer [VOL]
    • H04N 19/503: Predictive coding involving temporal prediction
    • H04N 7/15: Conference systems
    • G06T 2207/10024: Color image (image acquisition modality)
    • G06T 2207/10028: Range image; depth image; 3D point clouds

Definitions

  • The present invention relates to the field of image data processing technologies and, in particular, to an image processing method and device. Background Art
  • The immersion conferencing system is a typical next-generation multimedia conferencing system that provides a more realistic, immersive user experience. It typically applies new technologies such as high-definition audio/video, stereo audio, 3D video, and augmented reality to give users an immersive experience when attending a conference.
  • The immersion conferencing system includes two cameras: a traditional color camera that captures the user's color image, and a depth camera that captures the distance between the user and the camera.
  • The depth information captured by the depth camera is of great help to subsequent 3D image synthesis and skeleton recognition and tracking techniques.
  • An embodiment of the present invention provides an image processing method and device that implement image segmentation of different users in a video image and apply different encoding modes to the segmented regions, so that network bandwidth usage during video image transmission can be reduced.
  • an image processing method including:
  • the video image frame corresponding to the video image at the same time is segmented by using the depth image, to obtain an outline of the foreground image in the video image frame, including:
  • the outputting of the depth data corresponding to the contour includes: simplifying the depth image into a binary image according to the contour;
  • the binary image is encoded, and the encoded data corresponding to the binary image is output.
  • the outputting the depth data corresponding to the contour of the foreground image includes:
  • an image processing method including:
  • the method further includes:
  • an image processing apparatus including:
  • an acquisition module configured to acquire a video image and a depth image of the object
  • a contour segmentation module configured to segment, by using the depth image, a video image frame corresponding to the video image at the same time, to obtain the The outline of the foreground image in the video image frame
  • a video encoding module configured to perform, according to the outline of the foreground image, first encoding on the video image pixels within the contour of the foreground image in the video image frame, and second encoding on the video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than that of the second encoding;
  • a first output module configured to output encoded data corresponding to the video image frame
  • a second output module configured to output the depth data corresponding to the contour of the foreground image.
  • the contour segmentation module includes:
  • a pixel alignment unit configured to perform pixel alignment between the depth image and the video image frame;
  • a depth difference calculation unit configured to calculate the depth difference between each pixel on the depth image and each of its adjacent pixels, and to determine as a segmentation point any pixel for which the variance of the depth differences to all adjacent pixels is greater than a preset threshold;
  • the second output module includes: a binary image simplification unit, configured to simplify the depth image into a binary image according to the contour;
  • a binary image encoding unit configured to perform encoding processing on the binary image
  • a binary image output unit configured to output encoded data corresponding to the binary image.
  • the second output module includes:
  • a coordinate acquiring unit configured to acquire coordinate information of each of the dividing points
  • a compressing unit configured to perform compression processing on all the coordinate information
  • an image processing apparatus including: a receiving module configured to receive encoded data of a video image frame and depth data corresponding to the contour of a foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding has been performed on the video image pixels within the contour and second encoding on the video image pixels outside the contour to obtain the encoded data of the video image frame;
  • a foreground image segmentation module configured to segment, according to the depth data, the video image frame to obtain the foreground image in the video image frame;
  • the foreground playing module is configured to play the foreground image.
  • the method further includes:
  • the background playing module is configured to play a preset background image or a picture while playing the foreground image, and use the background image or the picture as a background of the foreground image.
  • an image processing system comprising the image processing device of any one of the above two.
  • An outline of a foreground image in the video image frame is obtained by segmenting the video image frame. According to the outline, the "foreground" image and the "background" image in the video image frame can be distinguished, and the video image pixels they contain are encoded separately: the "foreground" image within the contour is encoded at a higher encoding rate, and the "background" image outside the contour is encoded at a lower encoding rate.
  • FIG. 1 is a flowchart of an embodiment of an image processing method provided by the present invention;
  • FIG. 2 is an implementation flowchart of step 102 in FIG. 1;
  • FIG. 3 is a flowchart of another embodiment of an image processing method provided by the present invention;
  • FIG. 4 is a structural block diagram of an embodiment of an image processing apparatus provided by the present invention;
  • FIG. 5 is a structural block diagram of the contour segmentation module in FIG. 4;
  • FIG. 6 is a structural block diagram of the second output module 405 in FIG. 4;
  • FIG. 7 is another structural block diagram of the second output module 405 in FIG. 4;
  • FIG. 8 is a structural block diagram of another embodiment of an image processing apparatus provided by the present invention;
  • FIG. 9 is a structural block diagram of another embodiment of an image processing apparatus provided by the present invention;
  • FIG. 10 is a specific application scenario provided by the present invention.
  • A flow of an embodiment of an image processing method provided by the present invention may specifically include: Step 101: acquiring a video image and a depth image of an object;
  • This embodiment describes an image processing method on the video image transmitting side.
  • Through the color camera, the user's color video image can be captured, and at the same time, through the depth camera, a depth image of the user's distance from the camera can be captured.
  • Step 102: The video image frame corresponding to the video image at the same time is segmented by using the depth image to obtain an outline of the foreground image in the video image frame.
  • the image processing method, apparatus and system provided by the present invention can be applied to an immersion conference system.
  • The part of the video image that the user pays high attention to is usually only a portion of the received video image; this part is called the "foreground", while the part of the video image that the user pays less attention to is called the "background".
  • In a conference, for example, the images of the participating people are the "foreground", and the images other than these people, which are not of interest to the user, are the "background".
  • the video image frame corresponding to the video image at the same time is segmented by using the depth image, and the contour of the foreground image in the video image frame is obtained by image segmentation.
  • the outline of the foreground image in each video image frame can be obtained.
  • By the contour of the foreground image, the "foreground" and "background" in the video image frame can be separated: within the pixel range of the video image frame, all the pixels inside the contour constitute the "foreground" image, and all the pixels outside the contour constitute the "background" image.
  • Step 103: Perform, according to the outline of the foreground image, first encoding on the video image pixels within the contour of the foreground image in the video image frame, and second encoding on the video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than that of the second encoding.
  • The video image frame is subjected to ROI (Region of Interest) encoding using the contour of the foreground image: the video image pixels inside and outside the contour are encoded in different modes, namely, the "foreground" image within the contour of the foreground image is encoded at a higher encoding rate, and the "background" image outside the contour is encoded at a lower encoding rate.
  • Step 104 Output coded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image.
  • The encoded data corresponding to the video image frame and the depth data corresponding to the contour are output together, so that the receiving end can, according to the depth data corresponding to the contour of the foreground image, obtain the contour separating the "foreground" and "background" images in the video image frame, and then obtain the "foreground" image from the decoded video image frame based on the contour.
  • An outline of the foreground image in the video image frame is obtained by segmenting the video image frame, and according to the contour, the video image pixels of the "foreground" image and the "background" image are encoded in different ways: the "foreground" image within the contour is encoded at a higher encoding rate, and the "background" image outside the contour is encoded at a lower encoding rate.
  • In this way, the number of bits used in the encoding process can be reduced, the network bandwidth occupied during transmission of the video image frame can be reduced, and the image quality of the "foreground" image can be enhanced.
  • the occupancy rate of the network bandwidth during the video image transmission process can be further reduced.
  • Step 102, in which the video image frame corresponding to the video image at the same time is segmented by using the depth image to obtain an outline of the foreground image in the video image frame, may specifically include the following steps:
  • Step 201: Align the depth image with the video image frame pixel by pixel;
  • In this step, the depth image acquired by the depth camera and the color image captured by the color camera at the same time are pixel aligned.
  • If the resolution of the color image is higher than that of the depth image, the color image is downsampled to the resolution of the depth image; if the resolution of the color image is lower than that of the depth image, the color image is upsampled to the resolution of the depth image; if the resolutions are equal, no processing is required.
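The resampling described above can be sketched as follows (illustrative Python; the embodiment does not specify an interpolation method, so simple nearest-neighbour resampling is assumed here):

```python
def resample_nearest(img, new_h, new_w):
    """Nearest-neighbour resampling of a 2D image (list of rows).
    Used here to bring the color image to the depth image's
    resolution; works for both down- and upsampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

color = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
# If the depth image is 2x2, the 4x4 color image is downsampled to 2x2.
print(resample_nearest(color, 2, 2))  # -> [[1, 3], [9, 11]]
```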
  • Step 202: Calculate the depth difference between each pixel and each of its adjacent pixels on the depth image, and determine as a segmentation point any pixel for which the variance of the depth differences to all adjacent pixels is greater than a preset threshold.
  • In this step, the depth difference is usually calculated between each pixel and its 8 adjacent pixels on the depth image.
  • The value of each pixel on the depth image is actually the projected coordinate of a spatial point, that is, the distance Z from the spatial point to the plane where the depth sensor is located.
  • Step 203: Traverse all the pixels on the depth image to determine all the segmentation points;
  • Step 204: Extract the outline of the foreground image in the video image according to all the segmentation points.
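Steps 202 and 203 can be sketched as follows (illustrative Python; the 3x3 neighbourhood and the variance test follow the description above, while the example depth values and threshold are made up):

```python
def segmentation_points(depth, threshold):
    """Mark as a segmentation point every pixel whose depth differences
    to its (up to 8) adjacent pixels have a variance above threshold."""
    h, w = len(depth), len(depth[0])
    points = []
    for r in range(h):
        for c in range(w):
            # Depth differences to all existing neighbours (step 202).
            diffs = [depth[r][c] - depth[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0)
                     and 0 <= r + dr < h and 0 <= c + dc < w]
            mean = sum(diffs) / len(diffs)
            var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
            if var > threshold:
                points.append((r, c))
    return points

# A near object (depth 10) in the corner of a flat background (depth 100):
depth = [[10, 10, 100],
         [10, 10, 100],
         [100, 100, 100]]
print(segmentation_points(depth, threshold=500))
```

Pixels deep inside the object or the background see uniform neighbours (variance 0), so only the border pixels are reported, which is exactly the contour that step 204 then extracts.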
  • In the embodiment of the present invention, in the process of transmitting a video image frame to the receiving end, the depth data corresponding to the contour also needs to be transmitted.
  • the embodiment of the present invention provides the following two processing methods:
  • a JBIG2 encoder can be applied when encoding a binary image.
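JBIG2 itself is a full binary-image coding standard; as a stand-in that only illustrates why a binary contour mask compresses well, a toy run-length encoder (not part of the embodiment) can be sketched:

```python
def rle_encode(bits):
    """Run-length encode a flat sequence of 0/1 mask bits as
    (value, run_length) pairs -- a toy stand-in for a real binary
    image coder such as JBIG2."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([b, 1])  # start a new run
    return [tuple(r) for r in runs]

mask_row = [0, 0, 0, 1, 1, 0]
print(rle_encode(mask_row))  # -> [(0, 3), (1, 2), (0, 1)]
```

Because the binary image contains long uniform runs inside and outside the contour, even this naive scheme shrinks it substantially; JBIG2 adds context modelling and arithmetic coding on top.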
  • (2) acquiring the coordinate information of each segmentation point, compressing all the coordinate information, and outputting the compressed data corresponding to all the coordinate information obtained by the compression.
  • The coordinate information of all the segmentation points is obtained, including the spatial coordinates or vector coordinates of the pixel corresponding to each segmentation point; the spatial coordinates are, for example, (x, y) coordinates. The coordinate information of all the segmentation points is then grouped together, for example expressed as a data set. The data set containing all the segmentation point coordinate information is compressed and transmitted through the transmission network to the receiving end of the video image.
  • the above embodiment mainly describes an image processing method on the image transmitting side in image processing. Accordingly, the present invention also provides an image processing method mainly for an image processing method on an image receiving side in an image processing process.
  • A flow of an embodiment of an image processing method may specifically include: Step 301: Receive encoded data of a video image frame and depth data corresponding to the contour of a foreground image in the video image frame, where in the video image frame, according to the outline of the foreground image, first encoding has been performed on the video image pixels within the contour and second encoding on the video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than that of the second encoding.
  • the receiving side receives the encoded data corresponding to the video image frame transmitted by the image transmitting side and the depth data corresponding to the contour of the foreground image in the video image frame.
  • the video image transmitting side has obtained the outline of the foreground image in the video image frame by image segmentation, and through the contour, separates "foreground” and "background” in the video image frame, in the video image frame, All pixels within the outline constitute a "foreground” image, and all pixels outside the outline constitute a "background” image.
  • Step 302 Segment the video image frame according to the depth data to obtain a foreground image in the video image frame.
  • The receiving side can decode the received encoded data to obtain the video image collected by the transmitting side, and according to the received depth data, segment the received video image to obtain the foreground image in it. This part of the foreground image is usually the part of the image that the video user is more concerned about.
  • Step 303 Play the foreground image.
  • After the foreground image is obtained, it may be played. Since the background image other than the foreground image in the received video image is often not of interest to the user, the background image may be left unplayed.
  • When the foreground image is played, it may be set to be played in a playback window of the conference interface in the video conference system.
  • The video image transmitting side obtains the contour of the foreground image by segmenting the video image frame. According to the contour, the "foreground" and "background" images in the video image frame can be distinguished, and in turn the video image pixels they contain are encoded differently: the "foreground" image within the contour is encoded at a higher encoding rate, and the "background" image outside the contour is encoded at a lower encoding rate.
  • In this way, the number of bits used in the encoding process can be reduced, the network bandwidth occupied during transmission of the video image frame can be reduced, and the image quality of the "foreground" image can be enhanced.
  • the occupancy rate of the network bandwidth during the video image transmission process can be further reduced.
  • Since the background image other than the foreground image in the received video image is often not noticed by the user, in order to enhance the immersive experience when the user participates in a video conference, the background image may be left unplayed; instead, while the foreground image is being played, a preset background image or picture is played and used as the background of the foreground image.
  • The portrait of the other party talking with the current system user is usually a "foreground" image, and while this "foreground" image is played, the preset background image or picture can be played and used as the "background" of the conference interface, displayed together with the portrait of the counterpart user.
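The receiving-side compositing can be sketched as follows (illustrative Python; the mask convention of 0 for pixels inside the contour and 1 for pixels outside it follows the binary-image embodiment, and the pixel values are made up):

```python
def composite(foreground, mask, background):
    """Play the foreground over a preset background: take the decoded
    pixel where the mask marks foreground (value 0), otherwise the
    preset background pixel."""
    return [[fg if m == 0 else bg
             for fg, m, bg in zip(f_row, m_row, b_row)]
            for f_row, m_row, b_row in zip(foreground, mask, background)]

fg   = [[11, 12], [13, 14]]   # decoded video frame
mask = [[0, 1], [1, 0]]       # 0 = inside contour (foreground), 1 = outside
bg   = [[99, 99], [99, 99]]   # preset background picture
print(composite(fg, mask, bg))  # -> [[11, 99], [99, 14]]
```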
  • an embodiment of an image processing device provided by the present invention may specifically include:
  • an acquisition module 401 configured to collect a video image and a depth image of the object
  • the contour segmentation module 402 is configured to segment the video image frame corresponding to the video image at the same time by using the depth image to obtain an outline of the foreground image in the video image frame;
  • a video encoding module 403 configured to perform, according to an outline of the foreground image, a first encoding of a video image pixel within the contour of the foreground image in the video image frame, where the video image frame is The video image pixel points other than the contour are subjected to the second encoding to obtain the encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than the encoding rate of the second encoding;
  • a first output module 404 configured to output encoded data corresponding to the video image frame; a second output module 405 configured to output the depth data corresponding to the contour of the foreground image.
  • The video image and the depth image of the object are collected by the acquisition module, and the contour of the foreground image in the video image frame is obtained by the contour segmentation module segmenting the video image frame.
  • The contour distinguishes the "foreground" image from the "background" image in the video image frame, and the video encoding module then encodes the video image pixels of the two in different ways: the "foreground" image within the contour at a higher encoding rate, and the "background" image outside the contour at a lower encoding rate.
  • In this way, the number of bits used in the encoding process can be reduced, the network bandwidth occupied during video image transmission is reduced, and the image quality of the "foreground" image is enhanced.
  • the occupancy rate of the network bandwidth during the video image transmission process can be further reduced.
  • the contour segmentation module 402 may specifically include:
  • a pixel alignment unit 501 configured to perform pixel alignment on the depth image of the video image frame and the video image;
  • a depth difference calculation unit 502 configured to calculate the depth difference between each pixel on the depth image and each of its adjacent pixels, and to determine as a segmentation point any pixel for which the variance of the depth differences to all adjacent pixels is greater than a preset threshold; a segmentation point determining unit 503 configured to traverse all the pixels on the depth image and determine all the segmentation points;
  • the contour obtaining unit 504 is configured to obtain a contour of the foreground image in the video image according to the all dividing points.
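The depth-difference test that these units describe can be sketched as follows. This is a minimal illustration only, not the patented implementation: the 8-neighbour window and the threshold value are assumptions chosen for the toy example.

```python
import numpy as np

def find_segmentation_points(depth, threshold):
    """Mark a pixel as a segmentation (contour) point when the variance of
    its depth differences with the 8 neighbouring pixels exceeds `threshold`."""
    h, w = depth.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):          # borders skipped in this sketch
        for x in range(1, w - 1):
            diffs = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    diffs.append(float(depth[y, x]) - float(depth[y + dy, x + dx]))
            if np.var(diffs) > threshold:
                mask[y, x] = True      # discontinuity point on the depth image
    return mask
```

On a flat background with a nearer rectangular object, only pixels along the depth discontinuity are marked; smooth interior and background regions are not.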
• As shown in FIG. 6, the second output module 405 may specifically include:
• a binary image simplification unit 601, configured to simplify the depth image into a binary image according to the contour;
• a binary image encoding unit 602, configured to encode the binary image;
• a binary image output unit 603, configured to output encoded data corresponding to the binary image.
• In this implementation, the values of all pixels within the contour are set to 0, and the values of all pixels outside the contour are set to 1.
• At output, the depth image simplified into a binary image is encoded; the code streams of the encoded binary image and the encoded color image are then output separately and transmitted through the network to the receiving end of the video image.
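A minimal sketch of the binarization step described above, assuming the set of pixels inside the contour is already available as a boolean mask (the patent derives this region from the segmentation points):

```python
import numpy as np

def depth_to_binary(depth, foreground_mask):
    """Replace the full depth map with a 1-bit image: 0 inside the
    foreground contour, 1 everywhere else, as described above."""
    binary = np.ones_like(depth, dtype=np.uint8)   # outside the contour -> 1
    binary[foreground_mask] = 0                    # inside the contour  -> 0
    return binary
```

The resulting bi-level image is far cheaper to encode than the full depth map; the patent mentions a JBIG2 encoder for this step.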
  • the second output module 405 may specifically include:
• a coordinate acquiring unit 701, configured to acquire coordinate information of each of the segmentation points;
• a compression unit 702, configured to compress all the coordinate information;
• a coordinate output unit 703, configured to output compressed data corresponding to all the coordinate information obtained by the compression.
• In this implementation, the coordinate information of all segmentation points is acquired, including the spatial coordinates or vector coordinates of the pixels corresponding to the segmentation points; spatial coordinates are, for example, (x, y) coordinates.
• The coordinate information of all segmentation points is then gathered together, for example, represented as one data set.
• The data set containing the coordinate information of all segmentation points is compressed and transmitted through the network to the receiving end of the video image.
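The gather-and-compress step can be illustrated as follows. The JSON-plus-zlib serialization is an assumption made for this sketch; the text only requires that the coordinate data set be compressed before transmission:

```python
import json
import zlib

def compress_contour(points):
    """Pack the (x, y) coordinates of all segmentation points into one
    data set and compress it for transmission."""
    payload = json.dumps(sorted(points)).encode("utf-8")
    return zlib.compress(payload)

def decompress_contour(blob):
    """Receiver side: recover the list of (x, y) segmentation points."""
    return [tuple(p) for p in json.loads(zlib.decompress(blob))]
```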
• The foregoing image processing device is the corresponding device on the image sending side of the image processing process.
• An embodiment of the present invention further provides an image processing device that is the corresponding device on the image receiving side of the image processing process.
• As shown in FIG. 8, an embodiment of an image processing device provided by the present invention may specifically include:
• a receiving module 801, configured to receive encoded data of a video image frame and depth data corresponding to the contour of the foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and the encoding rate of the first encoding is higher than the encoding rate of the second encoding;
• a foreground image segmentation module 802, configured to segment the video image frame to obtain the foreground image in the video image frame;
  • the foreground play module 803 is configured to play the foreground image.
• In this embodiment of the present invention, because the video image sending side obtains the contour of the foreground image in the video image frame by segmenting the video image frame, the "foreground" image and the "background" image in the video image frame can be distinguished according to the contour of the foreground image.
• The video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, the "foreground" image within the contour is encoded at a higher
• encoding rate, and the "background" image outside the contour is encoded at a lower encoding rate.
• This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image.
• In addition, because only the depth data of the contour pixels is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced.
• As shown in FIG. 9, the image processing device may further include:
  • the background playing module 804 is configured to play a preset background image or a picture while playing the foreground image, and use the background image or the picture as a background of the foreground image.
• Because the background image in the received video image is usually not of interest to the user, in this embodiment of the present invention the background image need not be
• played; instead, a preset background image or picture is played while the foreground image is being played, and the background image or picture is used as the background of the foreground image.
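The playback behaviour described here amounts to compositing the decoded foreground pixels over a preset background picture. A minimal sketch, assuming a boolean foreground mask recovered from the transmitted contour data:

```python
import numpy as np

def compose_with_preset_background(frame, foreground_mask, background):
    """Play back only the foreground: pixels inside the contour come from
    the decoded frame, everything else from a preset background picture."""
    out = background.copy()
    out[foreground_mask] = frame[foreground_mask]  # keep foreground pixels only
    return out
```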
  • the present invention also provides an image processing system.
  • the system may specifically include: an image transmitting device and an image receiving device, wherein
• the image sending device is configured to: collect a video image and a depth image of an object; use the depth image to segment the video image frame corresponding to the video image at the same moment, to obtain the contour of the foreground image in the video image frame;
• according to the contour of the foreground image, perform first encoding on video image pixels within the contour and perform second encoding on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than the encoding rate of the second encoding; and output the encoded data corresponding to the video image frame and the depth data corresponding to the contour of the foreground image;
• the image receiving device is configured to: receive encoded data of a video image frame and depth data corresponding to the contour of the foreground image in the video image frame; segment the video image frame according to the depth data, to obtain the foreground image in the video image frame; and play the foreground image.
• On the image sending device side, the contour of the foreground image in the video image frame is obtained by segmenting the video image frame; according to the contour of the foreground image, the
• "foreground" image and the "background" image in the video image frame can be distinguished, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is:
• a higher-rate encoding scheme is used for the "foreground" image within the contour, and a lower-rate encoding scheme is used for the "background" image outside the contour.
• In addition, the image sending device may further execute the flow shown in FIG. 2 and the foregoing two processing methods for outputting the depth data corresponding to the contour; the image receiving device may further execute the flow shown in FIG. 3, and can play a preset background image or picture while playing the foreground image, using the background image or picture as the background of the foreground image.
• In the application scenario shown in FIG. 10, the depth camera 1001 on the image sending device a side captures a depth image of the distance between the user and the camera, and the color camera 1002 captures a color video image of the user, yielding the video image frame of the current video image.
• The contour segmentation module 1003 uses the depth image to segment the video image frame and obtain the contour of the foreground image in the video image frame; according to the contour, the depth image is simplified into a binary image, which is encoded by the JBIG2 encoder 1004; at the same time, the ROI-based video encoder 1005 uses the contour of the foreground image to perform ROI encoding on the video image frame.
• The two kinds of encoded data are transmitted over the network 1006 to the JBIG2 decoder 1007 and the ROI decoder 1008 on the image receiving device b side; the contour of the foreground image in the video image frame is obtained through the JBIG2 decoder, and the foreground image segmentation module 1009 then segments out the foreground image in the video image frame separately.
• Further, the background playing module 1010 plays a preset background image or picture while the foreground image is being played, and uses the preset background image or picture as the background of the foreground image.
• In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners.
• The device embodiments described above are merely illustrative.
• The division of the units is merely a logical function division; in actual implementation there may be other division manners.
• For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
• The mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
• The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
• If the functions are implemented in the form of a software functional unit and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present invention.
• The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the present invention disclose an image processing method and device. The method includes: collecting a video image and a depth image of an object; using the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame; according to the contour of the foreground image, performing first encoding on video image pixels within the contour of the foreground image in the video image frame, and performing second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where an encoding rate of the first encoding is higher than an encoding rate of the second encoding; and outputting the encoded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image. The embodiments of the present invention reduce the network bandwidth occupied during video image transmission.

Description

Image processing method and device
This application claims priority to Chinese Patent Application No. 201310362321.1, filed with the Chinese Patent Office on August 19, 2013 and entitled "Image processing method and device", which is incorporated herein by reference in its entirety.

Technical Field
[01] The present invention relates to the field of image data processing technologies, and in particular, to an image processing method and device.

Background
[02] The immersive conference system is typical of next-generation multimedia conference systems; it can provide a more realistic, immersive user experience. It usually applies new technologies such as high-definition audio/video, stereo sound, 3D video, and augmented reality, so that users can have an immersive experience when attending a conference.
[03] Usually, an immersive conference system includes two kinds of cameras: a conventional color camera, which can capture color images of a user, and a depth camera, which can capture depth images of the distance between the user and the camera. The depth information captured by the depth camera is of great help to subsequent techniques such as 3D image synthesis, skeleton recognition, and tracking.
[04] In an existing immersive conference system, network bandwidth occupancy is high when the video images of the conference system are synthesized.

Summary
[05] Embodiments of the present invention provide an image processing method and device that segment, within a video image, the images with different degrees of user attention and encode the segmented images in different manners, so that the network bandwidth occupied during video image transmission can be reduced.
[06] To solve the foregoing technical problem, the embodiments of the present invention disclose the following technical solutions:
[07] According to a first aspect, an image processing method is provided, including:
[08] collecting a video image and a depth image of an object; [09] using the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame;
[10] according to the contour of the foreground image, performing first encoding on video image pixels within the contour of the foreground image in the video image frame, and performing second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
[11] outputting the encoded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image.
[12] With reference to the first aspect, in a first possible implementation, the using the depth image to segment the video image frame corresponding to the video image at the same moment to obtain the contour of the foreground image in the video image frame includes:
[13] performing pixel alignment on the depth image and the video image of the video image frame;
[14] calculating a depth difference between each pixel on the depth image and each of its adjacent pixels, and determining a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point;
[15] traversing all pixels on the depth image to determine all segmentation points;
[16] obtaining the contour of the foreground image in the video image according to all the segmentation points.
[17] With reference to the first aspect, in a second possible implementation, the outputting depth data corresponding to the contour includes: [18] simplifying the depth image into a binary image according to the contour;
[19] encoding the binary image, and outputting encoded data corresponding to the binary image.
[20] With reference to the first aspect and the first possible implementation, in a third possible implementation, the outputting depth data corresponding to the contour of the foreground image includes:
[21] acquiring coordinate information of each of the segmentation points; [22] compressing all the coordinate information, and outputting compressed data corresponding to all the coordinate information obtained by the compression.
[23] According to a second aspect, an image processing method is provided, including:
[24] receiving encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
[25] segmenting the video image frame according to the depth data, to obtain the foreground image in the video image frame;
[26] playing the foreground image. [27] With reference to the second aspect, in a first possible implementation, the method further includes:
[28] while playing the foreground image, playing a preset background image or picture, and using the background image or picture as the background of the foreground image.
[29] According to a third aspect, an image processing device is provided, including:
[30] an acquisition module, configured to collect a video image and a depth image of an object; [31] a contour segmentation module, configured to use the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame;
[32] a video encoding module, configured to: according to the contour of the foreground image, perform first encoding on video image pixels within the contour of the foreground image in the video image frame, and perform second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
[33] a first output module, configured to output the encoded data corresponding to the video image frame;
[34] a second output module, configured to output depth data corresponding to the contour of the foreground image.
[35] With reference to the third aspect, in a first possible implementation, the contour segmentation module includes:
[36] a pixel alignment unit, configured to perform pixel alignment on the depth image and the video image of the video image frame; [37] a depth difference calculation unit, configured to calculate a depth difference between each pixel on the depth image and each of its adjacent pixels, and determine a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point;
[38] a segmentation point determining unit, configured to traverse all pixels on the depth image to determine all segmentation points; [39] a contour obtaining unit, configured to obtain the contour of the foreground image in the video image according to all the segmentation points. [40] With reference to the third aspect, in a second possible implementation, the second output module includes: [41] a binary image simplification unit, configured to simplify the depth image into a binary image according to the contour;
[42] a binary image encoding unit, configured to encode the binary image;
[43] a binary image output unit, configured to output encoded data corresponding to the binary image.
[44] With reference to the third aspect and the first possible implementation, in a third possible implementation, the second output module includes:
[45] a coordinate acquiring unit, configured to acquire coordinate information of each of the segmentation points; [46] a compression unit, configured to compress all the coordinate information;
[47] a coordinate output unit, configured to output compressed data corresponding to all the coordinate information obtained by the compression. [48] According to a fourth aspect, an image processing device is provided, including: [49] a receiving module, configured to receive encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and an encoding rate of the first encoding is higher than an encoding rate of the second encoding; [50] a foreground image segmentation module, configured to segment the video image frame according to the depth data, to obtain the foreground image in the video image frame;
[51] a foreground playing module, configured to play the foreground image.
[52] With reference to the fourth aspect, in a first possible implementation, the device further includes:
[53] a background playing module, configured to: while the foreground image is being played, play a preset background image or picture, and use the background image or picture as the background of the foreground image.
[54] According to a fifth aspect, an image processing system is provided, including both of the image processing devices described above.
[55] In the embodiments of the present invention, the contour of the foreground image in a video image frame is obtained by segmenting the video image frame. According to the contour of the foreground image, the "foreground" image and the "background" image in the video image frame can be distinguished, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, a higher-rate encoding scheme is used for the "foreground" image within the contour and a lower-rate encoding scheme is used for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted, rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced.
Brief Description of the Drawings
[56] To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
[57] FIG. 1 is a flowchart of an embodiment of an image processing method according to the present invention;
[58] FIG. 2 is a flowchart of the implementation of step 101 in FIG. 1;
[59] FIG. 3 is a flowchart of another embodiment of an image processing method according to the present invention;
[60] FIG. 4 is a structural block diagram of an embodiment of an image processing device according to the present invention;
[61] FIG. 5 is a structural block diagram of the contour segmentation module in FIG. 4;
[62] FIG. 6 is a structural block diagram of one form of the second output module 405 in FIG. 4;
[63] FIG. 7 is a structural block diagram of another form of the second output module 405 in FIG. 4;
[64] FIG. 8 is a structural block diagram of another embodiment of an image processing device according to the present invention;
[65] FIG. 9 is a structural block diagram of still another embodiment of an image processing device according to the present invention;
[66] FIG. 10 shows a specific application scenario according to the present invention.
Detailed Description
[67] To help persons skilled in the art better understand the technical solutions in the embodiments of the present invention, and to make the foregoing objectives, features, and advantages of the embodiments of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings. [68] First, an image processing method provided by the present invention is introduced. The image processing method, device, and system provided by the present invention may be applied to an immersive conference system.
[69] FIG. 1 shows the flow of an embodiment of an image processing method provided by the present invention, which may specifically include: [70] Step 101: Collect a video image and a depth image of an object.
[71] This embodiment describes the image processing method on the video image sending side. In this step, a color camera can capture a color video image of the user, and at the same time a depth camera can capture a depth image of the distance between the user and the camera.
[72] Step 102: Use the depth image to segment the video image frame corresponding to the video image at the same moment, to obtain the contour of the foreground image in the video image frame.
[73] The image processing method, device, and system provided by the present invention may be applied to an immersive conference system. In an immersive conference system, the video image that draws a high degree of user attention is in fact only a part of the actually received video image; this part of the video image is called the "foreground", while the parts of the video image that draw less attention from the user are called the "background". For example, in a typical video conference, the other people talking with the current system user are what the user cares about, so the images of these people are the "foreground"; the images other than these people, which the user does not pay attention to, are the "background". [74] In the foregoing step, the depth image is used to segment the video image frame corresponding to the video image at the same moment, and the contour of the foreground image in the video image frame is obtained through image segmentation. In this manner, the contour of the foreground image in each video image frame can be obtained.
[75] With the contour of the foreground image, the "foreground" and the "background" in the video image frame can be separated: within the pixel range of the video image frame, all pixels within the contour constitute the "foreground" image, and all pixels outside the contour constitute the "background" image.
[76] Step 103: According to the contour of the foreground image, perform first encoding on video image pixels within the contour of the foreground image in the video image frame, and perform second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where the encoding rate of the first encoding is higher than that of the second encoding. [77] In this step, the contour of the foreground image is used to perform ROI (Region of Interest) encoding on the video image frame: the video image pixels within and outside the contour of the foreground image in the video image frame are encoded in different manners, that is, a higher-rate encoding scheme is used for the "foreground" image within the contour of the foreground image in the video image, and a lower-rate encoding scheme is used for the "background" image outside the contour of the foreground image in the video image. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during video image transmission, and enhances the image quality of the "foreground" image.
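As a rough illustration of the two-tier idea behind ROI encoding (this is not the actual video codec, and the quantization steps are assumptions for the sketch), one can quantize foreground pixels with a fine step and background pixels with a coarse step, so that the foreground keeps more bits of precision:

```python
import numpy as np

def roi_quantize(frame, foreground_mask, fg_step=4, bg_step=32):
    """Toy stand-in for ROI encoding: pixels inside the contour are
    quantized with a fine step (more bits, higher quality), pixels
    outside with a coarse step (fewer bits, lower quality)."""
    q = np.where(foreground_mask,
                 (frame // fg_step) * fg_step,   # fine quantization inside
                 (frame // bg_step) * bg_step)   # coarse quantization outside
    return q.astype(frame.dtype)
```

The reconstruction error stays small in the region of interest and is allowed to grow in the background, which is the trade-off the patent exploits to save bits.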
[78] Step 104: Output the encoded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image. [79] In this step, the encoded data corresponding to the video image frame and the depth data corresponding to the contour are output together, so that the receiving end can obtain, from the depth data corresponding to the contour of the foreground image, the contour that separates the "foreground" image and the "background" image in the video image frame, and can then obtain the "foreground" image from the decoded video image frame according to the contour of the foreground image.
[80] Because only the depth data of the pixels corresponding to the contour of the foreground image is transmitted, rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced.
[81] In the embodiments of the present invention, the contour of the foreground image in a video image frame is obtained by segmenting the video image frame. According to the contour of the foreground image, the "foreground" image and the "background" image in the video image frame can be distinguished, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, a higher-rate encoding scheme is used for the "foreground" image within the contour and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced. [82] To facilitate understanding of the technical solutions of the present invention, the technical solutions are described in detail below through specific implementations.
[83] In a specific implementation, in the foregoing step 101, the depth image is used to segment the video image frame corresponding to the video image at the same moment to obtain the contour of the foreground image in the video image frame. A specific implementation of this step, shown in FIG. 2, may include the following steps: [84] Step 201: Perform pixel alignment on the depth image and the video image of the video image frame.
[85] In this step, the depth image captured by the depth camera and the color image captured by the color camera at the same moment are pixel-aligned.
[86] Specifically, when the resolution of the color image is higher than that of the depth information image, the resolution of the color image is downsampled to that of the depth information image; when the resolution of the color image is lower than that of the depth information image, the resolution of the color image is upsampled to that of the depth information image; when the two resolutions are equal, no processing is needed.
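The resolution-matching rule described here can be sketched with nearest-neighbour resampling. The choice of resampling filter is an assumption made for this sketch; the text only specifies up- or downsampling the colour image to the depth image's resolution:

```python
import numpy as np

def align_resolution(color, depth_shape):
    """Resample the colour image to the depth image's resolution:
    downsample when it is larger, upsample when it is smaller,
    and leave it untouched when the sizes already match."""
    ch, cw = color.shape[:2]
    dh, dw = depth_shape
    if (ch, cw) == (dh, dw):
        return color                      # resolutions equal: no processing
    ys = np.arange(dh) * ch // dh         # nearest-neighbour row indices
    xs = np.arange(dw) * cw // dw         # nearest-neighbour column indices
    return color[ys][:, xs]
```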
[87] Step 202: Calculate the depth difference between each pixel on the depth image and each of its adjacent pixels, and determine a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point. [88] In this step, the depth difference between each pixel on the depth image and each of its adjacent pixels is calculated, usually as the difference between the depth value of each pixel and those of its 8 neighboring pixels. The value of each pixel in the depth image is in fact the projection coordinate of the corresponding spatial point, that is, the distance Z from the spatial point to the plane of the depth sensor, in millimeters. By calculating these depth differences, the discontinuity points on the depth image, that is, the segmentation points, can be extracted.
[89] For a given pixel, among the eight calculated depth differences, when one or more of the depth differences differ markedly from the remaining ones, that is, when the variance of the eight depth differences is greater than a certain preset threshold, the pixel can be determined to be a segmentation point.
[90] Step 203: Traverse all pixels on the depth image to determine all segmentation points.
[91] In this step, the foregoing depth-difference calculation with adjacent pixels is performed on each pixel of the depth image in turn, so that all segmentation points are determined. [92] Step 204: Obtain the contour of the foreground image in the video image according to all the segmentation points.
[93] In this step, once all segmentation points are determined and connected, they constitute the contour that distinguishes the "foreground" image from the "background" image in the video image.
[94] In the embodiments of the present invention, in the process of transmitting the video image frame to the receiving end, the depth data corresponding to the contour also needs to be transmitted. For outputting the depth data corresponding to the contour, the embodiments of the present invention provide the following two processing methods:
[95] (1) Simplify the depth image into a binary image according to the contour; encode the binary image and output the encoded data corresponding to the binary image.
[96] In this method, the values of all pixels within the contour are set to 0, and the values of all pixels outside the contour are set to 1.
[97] At output, the depth image simplified into a binary image is encoded, and the code streams of the encoded binary image and the encoded color image are then output separately and transmitted through the transmission network to the receiving end viewing the video image.
[98] In addition, a JBIG2 encoder may be used to encode the binary image. [99] (2) Acquire the coordinate information of each of the segmentation points; compress all the coordinate information and output the compressed data corresponding to all the coordinate information obtained by the compression.
[100] In this method, the coordinate information of all segmentation points is acquired, including the spatial coordinates or vector coordinates of the pixels corresponding to the segmentation points; spatial coordinates are, for example, (x, y) coordinates. Then the coordinate information of all segmentation points is gathered together, for example, represented as one data set. The data set containing the coordinate information of all segmentation points is compressed and transmitted through the transmission network to the receiving end viewing the video image.
[101] The foregoing embodiments mainly describe the image processing method on the image sending side. Correspondingly, the present invention also provides an image processing method mainly for the image receiving side in the image processing process.
[102] FIG. 3 shows the flow of an embodiment of an image processing method provided by the present invention, which may specifically include: [103] Step 301: Receive encoded data of a video image frame and depth data corresponding to the contour of the foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and the encoding rate of the first encoding is higher than that of the second encoding. [104] In this step, the receiving side receives the encoded data corresponding to the video image frame sent by the image sending side and the depth data corresponding to the contour of the foreground image in the video image frame. The video image sending side has already obtained the contour of the foreground image in the video image frame through image segmentation and, by means of the contour, separated the "foreground" and "background" in the video image frame: in the video image frame, all pixels within the contour constitute the "foreground" image, and all pixels outside the contour constitute the "background" image. [105] In addition, in the encoded data corresponding to the video image frame received by the receiving side, the video image sending side has used a higher-rate encoding scheme for the "foreground" image within the contour of the foreground image in the video image frame and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during video image transmission, and enhances the image quality of the "foreground" image. [106] Step 302: Segment the video image frame according to the depth data, to obtain the foreground image in the video image frame.
[107] In this step, the receiving side may decode the received encoded data to obtain the video image collected by the sending side, and may segment the received video image according to the received depth data to obtain the foreground image in the received video image. This foreground image is usually the part of the image that video users pay more attention to. [108] Step 303: Play the foreground image.
[109] In this step, after the foreground image is segmented out of the video image, the foreground image can be played. Because the background image other than the foreground image in the received video image is usually not of interest to the user, this background image need not be played. [110] For example, in a typical video conference, the other people talking with the current system user are what the user cares about, so the images of these people are the "foreground"; the images other than these people, which the user does not pay attention to, are the "background". With the embodiments of the present invention, the receiving side may play only the images, which the user pays attention to, of the people talking with the current system user, without playing the "background" images the user does not care about.
[111] For an immersive conference system, when the foreground image is played, the foreground image may be placed in the playback window of the conference interface of the video conference system for playing.
[112] In the embodiments of the present invention, because the video image sending side obtains the contour of the foreground image in the video image frame by segmenting the video image frame, the "foreground" image and the "background" image in the video image frame can be distinguished according to the contour of the foreground image, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, a higher-rate encoding scheme for the "foreground" image within the contour and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced. [113] Because the background image other than the foreground image in the received video image is usually not of interest to the user, in order to improve the user's immersive experience in a video conference, this background image may be left unplayed; instead, while the foreground image is being played, a preset background image or picture is played and used as the background of the foreground image. For example, on the conference interface of an immersive conference system, the portrait of the other user talking with the current system user is usually the "foreground" image; when this "foreground" image is played, a preset background image or picture may be played and used as the "background" image of the conference interface, played together with the portrait of the other user displayed on the conference interface.
[114] Corresponding to the image processing method embodiments provided by the present invention, the present invention also provides an image processing device. [115] FIG. 4 shows an embodiment of an image processing device provided by the present invention, which may specifically include:
[116] an acquisition module 401, configured to collect a video image and a depth image of an object; [117] a contour segmentation module 402, configured to use the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame;
[118] a video encoding module 403, configured to: according to the contour of the foreground image, perform first encoding on video image pixels within the contour of the foreground image in the video image frame, and perform second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
[119] a first output module 404, configured to output the encoded data corresponding to the video image frame; [120] a second output module 405, configured to output depth data corresponding to the contour of the foreground image.
[121] In this embodiment of the present invention, the acquisition module collects the video image and the depth image of the object, and the contour segmentation module in the image processing device segments the video image frame to obtain the contour of the foreground image in the video image frame. According to the contour, the "foreground" image and the "background" image in the video image frame can be distinguished, and the video encoding module then encodes the video image pixels included in the "foreground" image and in the "background" image in different manners, that is, with a higher-rate encoding scheme for the "foreground" image within the contour and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during video image transmission, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced.
[122] In a feasible embodiment provided by the present invention, as shown in FIG. 5, the contour segmentation module 402 may specifically include:
[123] a pixel alignment unit 501, configured to perform pixel alignment on the depth image and the video image of the video image frame;
[124] a depth difference calculation unit 502, configured to calculate a depth difference between each pixel on the depth image and each of its adjacent pixels, and determine a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point; [125] a segmentation point determining unit 503, configured to traverse all pixels on the depth image to determine all segmentation points;
[126] a contour obtaining unit 504, configured to obtain the contour of the foreground image in the video image according to all the segmentation points.
[127] In a feasible embodiment provided by the present invention, as shown in FIG. 6, the second output module 405 may specifically include:
[128] a binary image simplification unit 601, configured to simplify the depth image into a binary image according to the contour; [129] a binary image encoding unit 602, configured to encode the binary image; [130] a binary image output unit 603, configured to output encoded data corresponding to the binary image. [131] In this implementation, the values of all pixels within the contour are set to 0, and the values of all pixels outside the contour are set to 1. At output, the depth image simplified into a binary image is encoded, and the code streams of the encoded binary image and the encoded color image are then output separately and transmitted through the transmission network to the receiving end viewing the video image.
[132] In another feasible embodiment provided by the present invention, as shown in FIG. 7, the second output module 405 may specifically include:
[133] a coordinate acquiring unit 701, configured to acquire coordinate information of each of the segmentation points; [134] a compression unit 702, configured to compress all the coordinate information;
[135] a coordinate output unit 703, configured to output compressed data corresponding to all the coordinate information obtained by the compression. [136] In this implementation, the coordinate information of all segmentation points is acquired, including the spatial coordinates or vector coordinates of the pixels corresponding to the segmentation points; spatial coordinates are, for example, (x, y) coordinates. Then the coordinate information of all segmentation points is gathered together, for example, represented as one data set. The data set containing the coordinate information of all segmentation points is compressed and transmitted through the transmission network to the receiving end viewing the video image.
[137] The foregoing image processing device is the corresponding device on the image sending side of the image processing process. An embodiment of the present invention further provides an image processing device that is the corresponding device on the image receiving side of the image processing process.
[138] FIG. 8 shows an embodiment of an image processing device provided by the present invention, which may specifically include:
[139] a receiving module 801, configured to receive encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame, where in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
[140] a foreground image segmentation module 802, configured to segment the video image frame to obtain the foreground image in the video image frame;
[141] a foreground playing module 803, configured to play the foreground image.
[142] In this embodiment of the present invention, because the video image sending side obtains the contour of the foreground image in the video image frame by segmenting the video image frame, the "foreground" image and the "background" image in the video image frame can be distinguished according to the contour of the foreground image, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, with a higher-rate encoding scheme for the "foreground" image within the contour and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced.
[143] In another embodiment provided by the present invention, as shown in FIG. 9, the image processing device may further include:
[144] a background playing module 804, configured to: while the foreground image is being played, play a preset background image or picture, and use the background image or picture as the background of the foreground image. [145] Because the background image other than the foreground image in the received video image is usually not of interest to the user, in order to improve the user's immersive experience in a video conference, in this embodiment of the present invention this background image may be left unplayed; instead, while the foreground image is being played, a preset background image or picture is played and used as the background of the foreground image.
[146] Correspondingly, the present invention also provides an image processing system. The system may specifically include an image sending device and an image receiving device, where
[147] the image sending device is configured to: collect a video image and a depth image of an object; use the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame; according to the contour of the foreground image, perform first encoding on video image pixels within the contour of the foreground image in the video image frame, and perform second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, where an encoding rate of the first encoding is higher than an encoding rate of the second encoding; and output the encoded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image;
[148] the image receiving device is configured to: receive encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame; segment the video image frame according to the depth data, to obtain the foreground image in the video image frame; and play the foreground image. [149] In the foregoing system embodiment, on the image sending device side, the contour of the foreground image in the video image frame is obtained by segmenting the video image frame. According to the contour of the foreground image, the "foreground" image and the "background" image in the video image frame can be distinguished, and the video image pixels included in the "foreground" image and in the "background" image are then encoded in different manners, that is, with a higher-rate encoding scheme for the "foreground" image within the contour and a lower-rate encoding scheme for the "background" image outside the contour. This encoding scheme reduces the number of bits used in encoding, lowers the network bandwidth occupied during transmission of the video image frame, and enhances the image quality of the "foreground" image. In addition, because only the depth data of the pixels corresponding to the contour is transmitted rather than the depth data of all pixels in the depth image, the network bandwidth occupied during video image transmission can be further reduced. [150] In addition, in the foregoing system, the image sending device may also execute the flow shown in FIG. 2 and the foregoing two processing methods for outputting the depth data corresponding to the contour; the image receiving device may also execute the flow shown in FIG. 3, and can play a preset background image or picture while playing the foreground image and use the background image or picture as the background of the foreground image.
[151] The foregoing technical solutions are explained below through a specific application scenario. [152] In the application scenario shown in FIG. 10, the depth camera 1001 on the image sending device a side captures a depth image of the distance between the user and the camera, and the color camera 1002 captures a color video image of the user, yielding the video image frame of the current video image. The contour segmentation module 1003 uses the depth image to segment the video image frame and obtain the contour of the foreground image in the video image frame. According to the contour, the depth image is simplified into a binary image, which is encoded by the JBIG2 encoder 1004; at the same time, the ROI-based video encoder 1005 uses the contour of the foreground image to perform ROI encoding on the video image frame. The two kinds of encoded data are transmitted over the network 1006 to the JBIG2 decoder 1007 and the ROI decoder 1008 on the image receiving device b side. The contour of the foreground image in the video image frame is obtained through the JBIG2 decoder, and the foreground image segmentation module 1009 then segments out the foreground image in the video image frame separately. Further, the background playing module 1010 may play a preset background image or picture while the foreground image is being played, and use the preset background image or picture as the background of the foreground image.
[153] A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
[154] A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
[155] In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the device embodiments described above are merely illustrative; for example, the division of the units is merely a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
[156] The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
[157] In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
[158] If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
[159] The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

Claims
1. An image processing method, comprising:
collecting a video image and a depth image of an object;
using the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame;
according to the contour of the foreground image, performing first encoding on video image pixels within the contour of the foreground image in the video image frame, and performing second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, wherein an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
outputting the encoded data corresponding to the video image frame and depth data corresponding to the contour of the foreground image.
2. The method according to claim 1, wherein the using the depth image to segment the video image frame corresponding to the video image at the same moment to obtain the contour of the foreground image in the video image frame comprises:
performing pixel alignment on the depth image and the video image of the video image frame;
calculating a depth difference between each pixel on the depth image and each of its adjacent pixels, and determining a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point;
traversing all pixels on the depth image to determine all segmentation points;
obtaining the contour of the foreground image in the video image according to all the segmentation points.
3. The method according to claim 1, wherein the outputting depth data corresponding to the contour comprises: simplifying the depth image into a binary image according to the contour;
encoding the binary image, and outputting encoded data corresponding to the binary image.
4. The method according to claim 2, wherein the outputting depth data corresponding to the contour of the foreground image comprises:
acquiring coordinate information of each of the segmentation points;
compressing all the coordinate information, and outputting compressed data corresponding to all the coordinate information obtained by the compression.
5. An image processing method, comprising:
receiving encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame, wherein in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
segmenting the video image frame according to the depth data, to obtain the foreground image in the video image frame; and playing the foreground image.
6. The method according to claim 5, further comprising: while playing the foreground image, playing a preset background image or picture, and using the background image or picture as a background of the foreground image.
7. An image processing device, comprising:
an acquisition module, configured to collect a video image and a depth image of an object;
a contour segmentation module, configured to use the depth image to segment a video image frame corresponding to the video image at the same moment, to obtain a contour of a foreground image in the video image frame;
a video encoding module, configured to: according to the contour of the foreground image, perform first encoding on video image pixels within the contour of the foreground image in the video image frame, and perform second encoding on video image pixels outside the contour in the video image frame, to obtain encoded data corresponding to the video image frame, wherein an encoding rate of the first encoding is higher than an encoding rate of the second encoding;
a first output module, configured to output the encoded data corresponding to the video image frame;
a second output module, configured to output depth data corresponding to the contour of the foreground image.
8. The device according to claim 7, wherein the contour segmentation module comprises:
a pixel alignment unit, configured to perform pixel alignment on the depth image and the video image of the video image frame; a depth difference calculation unit, configured to calculate a depth difference between each pixel on the depth image and each of its adjacent pixels, and determine a pixel for which the variance of the depth differences with all the adjacent pixels is greater than a preset threshold to be a segmentation point; a segmentation point determining unit, configured to traverse all pixels on the depth image to determine all segmentation points; and a contour obtaining unit, configured to obtain the contour of the foreground image in the video image according to all the segmentation points.
9. The device according to claim 7, wherein the second output module comprises:
a binary image simplification unit, configured to simplify the depth image into a binary image according to the contour;
a binary image encoding unit, configured to encode the binary image;
a binary image output unit, configured to output encoded data corresponding to the binary image.
10. The device according to claim 8, wherein the second output module comprises: a coordinate acquiring unit, configured to acquire coordinate information of each of the segmentation points;
a compression unit, configured to compress all the coordinate information;
a coordinate output unit, configured to output compressed data corresponding to all the coordinate information obtained by the compression.
11. An image processing device, comprising:
a receiving module, configured to receive encoded data of a video image frame and depth data corresponding to a contour of a foreground image in the video image frame, wherein in the video image frame, according to the contour of the foreground image, first encoding is performed on video image pixels within the contour of the foreground image and second encoding is performed on video image pixels outside the contour, to obtain the encoded data corresponding to the video image frame, and an encoding rate of the first encoding is higher than an encoding rate of the second encoding; a foreground image segmentation module, configured to segment the video image frame according to the depth data, to obtain the foreground image in the video image frame;
a foreground playing module, configured to play the foreground image.
12. The device according to claim 11, further comprising:
a background playing module, configured to: while the foreground image is being played, play a preset background image or picture, and use the background image or picture as a background of the foreground image.
13. An image processing system, comprising the image processing device according to any one of claims 7-10 and the image processing device according to any one of claims 11-12.
PCT/CN2014/070138 2013-08-19 2014-01-06 一种图像处理方法及设备 WO2015024362A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP14838494.4A EP2999221A4 (en) 2013-08-19 2014-01-06 IMAGE PROCESSING AND DEVICE
JP2016526410A JP6283108B2 (ja) 2013-08-19 2014-01-06 画像処理方法及び装置
US14/972,222 US9392218B2 (en) 2013-08-19 2015-12-17 Image processing method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310362321.1 2013-08-19
CN201310362321.1A CN104427291B (zh) 2013-08-19 2013-08-19 一种图像处理方法及设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/972,222 Continuation US9392218B2 (en) 2013-08-19 2015-12-17 Image processing method and device

Publications (1)

Publication Number Publication Date
WO2015024362A1 true WO2015024362A1 (zh) 2015-02-26

Family

ID=52483014

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/070138 WO2015024362A1 (zh) 2013-08-19 2014-01-06 一种图像处理方法及设备

Country Status (5)

Country Link
US (1) US9392218B2 (zh)
EP (1) EP2999221A4 (zh)
JP (1) JP6283108B2 (zh)
CN (1) CN104427291B (zh)
WO (1) WO2015024362A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106911902A (zh) * 2017-03-15 2017-06-30 微鲸科技有限公司 视频图像传输方法、接收方法及装置
EP3389008A4 (en) * 2015-12-08 2018-11-21 Panasonic Intellectual Property Management Co., Ltd. Image recognition device and image recognition method

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9560103B2 (en) * 2013-06-26 2017-01-31 Echostar Technologies L.L.C. Custom video content
US9847082B2 (en) * 2013-08-23 2017-12-19 Honeywell International Inc. System for modifying speech recognition and beamforming using a depth image
JP6529360B2 (ja) * 2015-06-26 2019-06-12 キヤノン株式会社 画像処理装置、撮像装置、画像処理方法およびプログラム
CN107180407A (zh) * 2016-03-09 2017-09-19 杭州海康威视数字技术股份有限公司 一种图片处理方法及装置
CN107295360B (zh) * 2016-04-13 2020-08-18 成都鼎桥通信技术有限公司 视频传输方法及装置
US10373592B2 (en) * 2016-08-01 2019-08-06 Facebook Technologies, Llc Adaptive parameters in image regions based on eye tracking information
WO2018101080A1 (ja) * 2016-11-30 2018-06-07 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元モデル配信方法及び三次元モデル配信装置
WO2018123801A1 (ja) * 2016-12-28 2018-07-05 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元モデル配信方法、三次元モデル受信方法、三次元モデル配信装置及び三次元モデル受信装置
JP2018136896A (ja) * 2017-02-24 2018-08-30 キヤノン株式会社 情報処理装置、システム、情報処理方法、および物品の製造方法
CN107945192B (zh) * 2017-12-14 2021-10-22 北京信息科技大学 一种托盘纸箱垛型实时检测方法
CN107948650A (zh) * 2017-12-25 2018-04-20 横琴国际知识产权交易中心有限公司 一种视频编码方法及装置
KR102664392B1 (ko) 2018-06-28 2024-05-08 삼성전자주식회사 디스플레이 장치
KR102673817B1 (ko) 2018-07-30 2024-06-10 삼성전자주식회사 3차원 영상 표시 장치 및 영상 처리 방법
KR102546321B1 (ko) 2018-07-30 2023-06-21 삼성전자주식회사 3차원 영상 표시 장치 및 방법
CN109089097A (zh) * 2018-08-28 2018-12-25 恒信东方文化股份有限公司 一种基于vr图像处理的焦点对象选取方法
US11089279B2 (en) 2018-12-06 2021-08-10 Htc Corporation 3D image processing method, camera device, and non-transitory computer readable storage medium
CN111435551B (zh) * 2019-01-15 2023-01-13 华为技术有限公司 点云滤波方法、装置及存储介质
US11303877B2 (en) * 2019-08-13 2022-04-12 Avigilon Corporation Method and system for enhancing use of two-dimensional video analytics by using depth data
CN112395912B (zh) * 2019-08-14 2022-12-13 中移(苏州)软件技术有限公司 一种人脸分割方法、电子设备及计算机可读存储介质
CN113191210B (zh) * 2021-04-09 2023-08-29 杭州海康威视数字技术股份有限公司 一种图像处理方法、装置及设备
CN113613039B (zh) * 2021-08-09 2023-06-30 咪咕文化科技有限公司 视频传输方法、系统、计算设备及计算机存储介质
CN116193153B (zh) * 2023-04-19 2023-06-30 世优(北京)科技有限公司 直播数据的发送方法、装置及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101395923A (zh) * 2006-03-02 2009-03-25 汤姆森许可贸易公司 用于在图像信号编码中为图像中的像素块组确定比特分配的方法及设备
CN102158712A (zh) * 2011-03-22 2011-08-17 宁波大学 一种基于视觉的多视点视频信号编码方法
CN102156995A (zh) * 2011-04-21 2011-08-17 北京理工大学 一种运动相机下的视频运动前景分割方法
CN102999901A (zh) * 2012-10-17 2013-03-27 中国科学院计算技术研究所 基于深度传感器的在线视频分割后的处理方法及系统

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1013799A (ja) * 1996-06-19 1998-01-16 Mega Chips:Kk Videophone apparatus
US6055330A (en) * 1996-10-09 2000-04-25 The Trustees Of Columbia University In The City Of New York Methods and apparatus for performing digital image and video segmentation and compression using 3-D depth information
JPH11339007A (ja) * 1998-05-26 1999-12-10 Matsushita Electric Ind Co Ltd Image processing apparatus
US20020051491A1 (en) * 1998-11-20 2002-05-02 Kiran Challapali Extraction of foreground information for video conference
FR2806570B1 (fr) * 2000-03-15 2002-05-17 Thomson Multimedia Sa Method and device for encoding video images
JP2002016911A (ja) * 2000-06-29 2002-01-18 Toshiba Corp Object video encoding apparatus
US7426304B2 (en) * 2004-09-15 2008-09-16 Hewlett-Packard Development Company, L.P. Method and device for three-dimensional graphics to two-dimensional video encoding
JP4670303B2 (ja) * 2004-10-06 2011-04-13 Sony Corp Image processing method and image processing apparatus
KR100708180B1 (ko) * 2005-09-22 2007-04-17 Samsung Electronics Co., Ltd. Image compression apparatus and method
EP1806697B1 (en) * 2006-01-10 2016-08-10 Microsoft Technology Licensing, LLC Segmenting image elements
JP4679425B2 (ja) * 2006-04-20 2011-04-27 Toshiba Corp Image processing apparatus, image processing method, and program
TWI381717B (zh) * 2008-03-31 2013-01-01 Univ Nat Taiwan Method and system for segmenting moving target objects in digital video
US20120050475A1 (en) * 2009-05-01 2012-03-01 Dong Tian Reference picture lists for 3dv
GB2473247B (en) * 2009-09-04 2015-02-11 Sony Corp A method and apparatus for image alignment
WO2011121117A1 (en) * 2010-04-02 2011-10-06 Imec Virtual camera system
US8300938B2 (en) * 2010-04-09 2012-10-30 General Electric Company Methods for segmenting objects in images
US8818028B2 (en) * 2010-04-09 2014-08-26 Personify, Inc. Systems and methods for accurate user foreground video extraction
KR102472533B1 (ko) * 2010-08-11 2022-11-30 GE Video Compression, LLC Multi-view signal codec
US8649592B2 (en) * 2010-08-30 2014-02-11 University Of Illinois At Urbana-Champaign System for background subtraction with 3D camera
ES2395102B1 (es) * 2010-10-01 2013-10-18 Telefónica, S.A. Method and system for real-time foreground segmentation of images
JP2014140089A (ja) * 2011-05-10 2014-07-31 Sharp Corp Image encoding device, image encoding method, image encoding program, image decoding device, image decoding method, and image decoding program
JP6094863B2 (ja) * 2011-07-01 2017-03-15 Panasonic IP Management Co., Ltd. Image processing device, image processing method, program, and integrated circuit
KR101663394B1 (ko) * 2011-11-11 2016-10-06 GE Video Compression, LLC Adaptive partition coding
CN103108197A (zh) * 2011-11-14 2013-05-15 Nvidia Corp Priority compression method and system for wireless display of 3D video
JP2013118468A (ja) * 2011-12-02 2013-06-13 Sony Corp Image processing apparatus and image processing method
EP2706503A3 (en) * 2012-09-11 2017-08-30 Thomson Licensing Method and apparatus for bilayer image segmentation
CN103152569A (zh) * 2013-02-28 2013-06-12 Harbin Institute of Technology Depth-information-based compression method for video regions of interest
US9191643B2 (en) * 2013-04-15 2015-11-17 Microsoft Technology Licensing, Llc Mixing infrared and color component data point clouds
US9414016B2 (en) * 2013-12-31 2016-08-09 Personify, Inc. System and methods for persona identification using combined probability maps


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2999221A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3389008A4 (en) * 2015-12-08 2018-11-21 Panasonic Intellectual Property Management Co., Ltd. Image recognition device and image recognition method
US10339405B2 (en) 2015-12-08 2019-07-02 Panasonic Intellectual Property Management Co., Ltd. Image recognition device and image recognition method
CN106911902A (zh) * 2017-03-15 2017-06-30 Whaley Technology Co., Ltd. Video image transmission method, reception method, and apparatus
CN106911902B (zh) * 2017-03-15 2020-01-07 Whaley Technology Co., Ltd. Video image transmission method, reception method, and apparatus

Also Published As

Publication number Publication date
EP2999221A4 (en) 2016-06-22
US9392218B2 (en) 2016-07-12
JP6283108B2 (ja) 2018-02-21
EP2999221A1 (en) 2016-03-23
US20160105636A1 (en) 2016-04-14
CN104427291A (zh) 2015-03-18
JP2016527791A (ja) 2016-09-08
CN104427291B (zh) 2018-09-28

Similar Documents

Publication Publication Date Title
WO2015024362A1 (zh) Image processing method and device
US11770558B2 (en) Stereoscopic video encoding and decoding methods and apparatus
KR100669837B1 (ko) 입체 비디오 코딩을 위한 포어그라운드 정보 추출 방법
KR20130129471A (ko) 관심 객체 기반 이미지 처리
EP1917642A2 (en) Video processing method and device for depth extraction
US10958950B2 (en) Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
EP3922032A1 (en) Quantization step parameter for point cloud compression
US20200304773A1 (en) Depth codec for 3d-video recording and streaming applications
US11095901B2 (en) Object manipulation video conference compression
KR20130030252A (ko) 입체적인 3차원 이미지들의 압축을 보존하는 낮은 대역폭 컨텐츠를 위한 방법 및 장치
Garus et al. Bypassing depth maps transmission for immersive video coding
US10735766B2 (en) Point cloud auxiliary information coding
EP3973710A1 (en) A method, an apparatus and a computer program product for volumetric video encoding and decoding
CN111406404B (zh) 获得视频文件的压缩方法、解压缩方法、系统及存储介质
JP3859989B2 (ja) Image matching method, and image processing method and apparatus capable of using the method
CN116962743A (zh) Video image encoding and matting method and apparatus, and live streaming system
WO2021248349A1 (en) Combining high-quality foreground with enhanced low-quality background
Ali et al. Depth image-based spatial error concealment for 3-D video transmission
JP2002077844A (ja) 画像伝送装置及び画像伝送方法並びに画像伝送プログラムを記録したコンピュータ読み取り可能な記録媒体
CN113810725A (zh) Video processing method, apparatus, storage medium, and video communication terminal
WO2023051705A1 (zh) Video communication method and apparatus, electronic device, and computer-readable medium
US20230306687A1 (en) Mesh zippering
WO2023180844A1 (en) Mesh zippering
Zhao et al. Scalable coding of depth images with synthesis-guided edge detection
CN115988189A (zh) Remote banking service method, apparatus, and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14838494

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2014838494

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2016526410

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE