WO2017092072A1 - Distributed video encoding framework - Google Patents

Distributed video encoding framework

Info

Publication number
WO2017092072A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
side information
key
sent
intra
Application number
PCT/CN2015/097220
Other languages
French (fr)
Chinese (zh)
Inventor
程德强
刘海
张国鹏
寇旗旗
张剑英
Original Assignee
中国矿业大学
Application filed by 中国矿业大学
Publication of WO2017092072A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation


Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A distributed video encoding framework, comprising: a base viewpoint, an enhanced viewpoint, a Wyner-Ziv encoder, a Wyner-Ziv decoder, a first intra-frame encoder, a first intra-frame decoder, a temporal side-information generation module, a second intra-frame encoder, a second intra-frame decoder, a spatial side-information generation module, a fusion module, and a reconstruction module. The base viewpoint and the enhanced viewpoint are acquisition devices. The Wyner-Ziv encoder and decoder, the first intra-frame encoder and decoder, and the second intra-frame encoder and decoder respectively encode and decode first Wyner-Ziv frames, first key frames, and second key frames. The temporal and spatial side-information generation modules generate a temporal side-information frame and a spatial side-information frame, respectively. After the fusion module fuses the two side-information frames, the reconstruction module reconstructs the image. The framework adapts to severe and complex environments, offers relatively high fault tolerance and universal applicability, and can be widely applied in the mining industry.

Description

A Distributed Video Coding Framework

Technical Field

The present invention relates to image processing technology, and more particularly to a distributed video coding framework.

Background

In mines with complex, harsh environments, a wireless sensor network (WSN) uses a large number of energy-constrained micro-nodes to collect, transmit, and process mine environment information, so that management and dispatch personnel can follow the on-site situation in real time. However, for safe and efficient coal production, and for rescue work after a mine disaster, the information obtained by such a traditional sensor network can no longer meet the dispatchers' full information needs. Wireless video sensor networks (WVSN) have therefore attracted extensive attention from researchers, because they can acquire rich multimedia information such as images and video.

In a WVSN, the transmitted information consists mainly of audio or video, while the storage and processing capacity of a single sensor node is severely limited, so efficient compression coding of multimedia information has become an important aspect of WVSN research. Wireless video sensor networks aimed at different applications use different coding methods, because their node-correlation models and working mechanisms differ; in other words, there is no single efficient, universally applicable coding method for wireless video sensor networks across applications. In particular, where mine roadways are long and narrow and heavy electromechanical equipment passes through frequently, random deployment of coding nodes is impossible; moreover, mine roadways suffer from severe electromagnetic interference and poor wireless channel quality, which makes current coding methods unsuitable for such noisy, unreliable channels.

Thus, the prior art offers no highly fault-tolerant, universally applicable distributed video coding framework suited to complex and harsh environments.
发明内容Summary of the invention
有鉴于此,本发明的主要目的在于提供一种能适用于复杂恶劣环境的高容错性、普遍适用的分布式视频编码框架。In view of this, the main object of the present invention is to provide a widely applicable distributed video coding framework that can be applied to a complex and harsh environment with high fault tolerance.
为了达到上述目的,本发明提出的技术方案为:In order to achieve the above object, the technical solution proposed by the present invention is:
一种分布式视频编码框架,包括:基本视点、增强视点、Wyner-Ziv编码器、Wyner-Ziv解码器、第一帧内编码器、第一帧内解码器、时间边信息生成模块、第二帧内编码器、第二帧内解码器、空间边信息生成模块、融合模块、重构模块;其中,A distributed video coding framework includes: a base view point, an enhanced view point, a Wyner-Ziv encoder, a Wyner-Ziv decoder, a first intra encoder, a first intra decoder, a time side information generating module, and a second An intra encoder, a second intra decoder, a spatial side information generating module, a fusion module, and a reconstruction module; wherein
基本视点,用于采集第一环境视频图像,根据第一环境视频图像的序号将第一环境视频图像分为第一Wyner-Ziv帧与第一关键帧,将第一Wyner-Ziv帧、第一关键帧分别发送至Wyner-Ziv编码器、第一帧内编码器。a base view for collecting the first environment video image, dividing the first environment video image into the first Wyner-Ziv frame and the first key frame according to the sequence number of the first environment video image, and the first Wyner-Ziv frame, the first The key frames are sent to the Wyner-Ziv encoder and the first intra encoder respectively.
增强视点,用于采集第二环境视频图像,根据第二环境视频图像的序号将第二环境视频图像分为第二Wyner-Ziv帧与第二关键帧,将第二关键帧发送至第二帧内编码器。And enhancing the viewpoint for collecting the second environment video image, dividing the second environment video image into the second Wyner-Ziv frame and the second key frame according to the sequence number of the second environment video image, and sending the second key frame to the second frame Inner encoder.
Wyner-Ziv编码器,用于对基本视点发送的第一Wyner-Ziv帧进行去除像素间相关性的离散余弦变换,对将变换系数量化后形成的位平面进行信道编码,并将得到的Wyner-Ziv编码帧通过无线信道发送至Wyner-Ziv解码器。The Wyner-Ziv encoder is configured to perform a discrete cosine transform on the first Wyner-Ziv frame transmitted by the base view to remove the inter-pixel correlation, and perform channel coding on the bit plane formed by quantizing the transform coefficients, and obtain the Wyner- Ziv coded frames are sent over the wireless channel to the Wyner-Ziv decoder.
Wyner-Ziv解码器,用于对Wyner-Ziv编码器发送的Wyner-Ziv 编码帧进行解码,并将Wyner-Ziv解码帧发送至重构模块。Wyner-Ziv decoder for Wyner-Ziv sent to Wyner-Ziv encoder The encoded frame is decoded and the Wyner-Ziv decoded frame is sent to the reconstruction module.
第一帧内编码器,用于对基本视点发送的第一关键帧进行H.264帧内编码,并将得到的第一关键编码帧通过无线信道发送至第一帧内解码器。The first intra-frame encoder is configured to perform H.264 intra-frame coding on the first key frame sent by the base view, and send the obtained first key coded frame to the first intra-frame decoder through the wireless channel.
第一帧内解码器,用于对第一帧内编码器发送的第一关键编码帧进行H.264帧内解码,并将得到的第一关键解码帧发送至时间边信息生成模块。The first intra-frame decoder is configured to perform H.264 intra-frame decoding on the first key coded frame sent by the first intra-frame encoder, and send the obtained first key-decoded frame to the time-side information generating module.
时间边信息生成模块,用于对来自第一帧内解码器的两个连续的第一关键解码帧依次进行预处理、块匹配、双向运动内插后,将生成的时间边信息帧发送至融合模块。a time side information generating module, configured to perform preprocessing, block matching, and bidirectional motion interpolation on two consecutive first key decoding frames from the first intra decoder, and send the generated time side information frame to the fusion Module.
第二帧内编码器,用于对增强视点发送的第二关键帧进行H.264帧内编码,并将得到的第二关键编码帧通过无线信道发送至第二帧内解码器。And a second intra-frame encoder, configured to perform H.264 intra-frame coding on the second key frame sent by the enhanced view, and send the obtained second key coded frame to the second intra-frame decoder through the wireless channel.
第二帧内解码器,用于对第二帧内编码器发送的第二关键编码帧进行H.264帧内解码,并将得到的第二关键解码帧发送至空间信息生成模块。And a second intra-frame decoder, configured to perform H.264 intra-frame decoding on the second key coded frame sent by the second intra-frame encoder, and send the obtained second key-decoded frame to the spatial information generating module.
空间边信息生成模块,用于根据第二帧内解码器发送的第二关键解码帧进行运动估计,将得到的初始空间边信息帧发送至融合模块。The spatial side information generating module is configured to perform motion estimation according to the second key decoding frame sent by the decoder in the second frame, and send the obtained initial spatial side information frame to the fusion module.
融合模块,用于根据基本视点与增强视点之间的相关性,通过基础矩阵将空间边信息生成模块发送的初始空间边信息帧映射到基本视点,得到映射空间边信息帧,并采用平均内插法对时间边信息生成模块发送的时间边信息帧与映射空间边信息帧进行信息融合后,将得 到的融合信息帧发送至重构模块。The fusion module is configured to map the initial spatial side information frame sent by the spatial side information generating module to the basic view point according to the correlation between the basic view point and the enhanced view point, obtain the mapped spatial side information frame, and adopt average interpolation After the information is fused by the time side information frame sent by the time side information generating module and the mapping space side information frame, The resulting fused information frame is sent to the reconstruction module.
重构模块,用于对融合模块发送的融合信息帧进行滤波,并根据Wyner-Ziv解码器发送的Wyner-Ziv解码帧、经过滤波的融合信息帧进行图像重建。The reconstruction module is configured to filter the fusion information frame sent by the fusion module, and perform image reconstruction according to the Wyner-Ziv decoding frame and the filtered fusion information frame sent by the Wyner-Ziv decoder.
In summary, in the distributed video coding framework of the present invention, the base viewpoint and the enhanced viewpoint collect video images simultaneously, with the base viewpoint acting as the main acquisition device and the enhanced viewpoint as the auxiliary acquisition device. In the narrow mine roadway, the base viewpoint and the enhanced viewpoint are placed in parallel, so that corresponding epipolar lines between the video images they acquire are parallel to each other and lie on the same horizontal image scan line. The base viewpoint and the enhanced viewpoint are thus deployed in the mine roadway like a pair of human eyes. The video images collected by the base viewpoint are divided into Wyner-Ziv frames and first key frames; the Wyner-Ziv frames are encoded and sent to the monitoring room for decoding, and the first key frames are likewise encoded, sent to the monitoring room, decoded, and used to generate temporal side information. Second key frames are extracted from the video images collected by the enhanced viewpoint, encoded, sent to the monitoring room for decoding, and used to generate the initial spatial side information corresponding to the enhanced viewpoint. After the temporal side information and the initial spatial side information are preprocessed in the fusion module, the initial spatial side information is mapped, according to the correlation between the base viewpoint and the enhanced viewpoint, to mapped spatial side information corresponding to the base viewpoint; the temporal side information and the mapped spatial side information are then fused, and the reconstruction module reconstructs and reproduces the video images of the mine roadway. The framework borrows from the characteristics of the human visual system: by using the video images acquired by the enhanced viewpoint adjacent to the base viewpoint as reference images, it avoids the poor quality of images reconstructed in the monitoring room that would otherwise result from incomplete video information. In addition, because the video images collected by the base viewpoint are split into Wyner-Ziv frames and first key frames that are encoded and decoded separately, while only the second key frames extracted from the enhanced viewpoint's video images are encoded and decoded, the invention also achieves high coding efficiency and decoding quality. In short, the distributed video coding framework of the present invention adapts to harsh environments and offers high fault tolerance and broad applicability.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of the structure of the distributed video coding framework of the present invention.

FIG. 2 is a schematic diagram of the structure of the temporal side-information generation module of the present invention.

FIG. 3 is a schematic diagram of the structure of the spatial side-information generation module of the present invention.

FIG. 4 is a schematic diagram of the structure of the fusion module of the present invention.

Detailed Description

To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and specific embodiments.
FIG. 1 is a schematic diagram of the structure of the distributed video coding framework of the present invention. As shown in FIG. 1, the coding framework of the present invention comprises: a base viewpoint 1, an enhanced viewpoint 2, a Wyner-Ziv encoder 3, a Wyner-Ziv decoder 4, a first intra-frame encoder 5, a first intra-frame decoder 6, a temporal side-information generation module 9, a second intra-frame encoder 7, a second intra-frame decoder 8, a spatial side-information generation module 10, a fusion module 11, and a reconstruction module 12; wherein:

The base viewpoint 1 collects a first environment video image, divides it into first Wyner-Ziv frames and first key frames according to the frame sequence numbers, and sends the first Wyner-Ziv frames and the first key frames to the Wyner-Ziv encoder 3 and the first intra-frame encoder 5, respectively.

The enhanced viewpoint 2 collects a second environment video image, divides it into second Wyner-Ziv frames and second key frames according to the frame sequence numbers, and sends the second key frames to the second intra-frame encoder 7.
In practice, the base viewpoint 1 is the main acquisition device, while the enhanced viewpoint 2 is an auxiliary acquisition device with a low capture rate, for example 1 frame per second or 1 frame per 2 seconds. For the groups of pictures collected by the base viewpoint 1 and the enhanced viewpoint 2, the video frames making up a group of pictures are usually divided into key frames and Wyner-Ziv frames according to the size of the group. Typically a group of pictures contains 2 frames, with odd-numbered frames used as key frames and even-numbered frames used as Wyner-Ziv frames. In practice, the assignment can also be reversed: odd-numbered frames as Wyner-Ziv frames and even-numbered frames as key frames.
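As an illustration of this frame split, consider the following Python sketch. It is not part of the patent: the function name is hypothetical, and it assumes a group-of-pictures size of 2 with odd-numbered frames as key frames.

```python
def split_gop(frames):
    """Split a 1-indexed frame sequence into key frames and Wyner-Ziv frames.

    Assumes a GOP size of 2: odd-numbered frames become key frames,
    even-numbered frames become Wyner-Ziv frames (the patent notes the
    assignment may also be reversed).
    """
    key_frames, wz_frames = [], []
    for idx, frame in enumerate(frames, start=1):
        if idx % 2 == 1:
            key_frames.append((idx, frame))   # -> intra-frame (H.264) encoder
        else:
            wz_frames.append((idx, frame))    # -> Wyner-Ziv encoder
    return key_frames, wz_frames
```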
The Wyner-Ziv encoder 3 applies a discrete cosine transform to the first Wyner-Ziv frames sent by the base viewpoint 1 to remove inter-pixel correlation, quantizes the transform coefficients, channel-codes the resulting bit planes, and sends the resulting Wyner-Ziv coded frames to the Wyner-Ziv decoder 4 over the wireless channel.
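The transform-and-quantize front end of the Wyner-Ziv encoder can be sketched as follows. This is an illustrative outline only: the patent specifies neither a block size, a quantizer, nor a channel code, so the 4×4 blocks, uniform quantizer, and bit-plane extraction below are assumptions, and the channel-coding stage (typically turbo or LDPC in Wyner-Ziv codecs) is left out.

```python
import numpy as np
from scipy.fft import dctn

def wz_encode_front_end(frame, block=4, levels=16):
    """DCT -> uniform quantization -> bit planes for one Wyner-Ziv frame.

    `frame` is a 2-D uint8 array whose sides are multiples of `block`.
    Returns a list of bit planes (most significant first) that would be
    handed to the channel coder.
    """
    h, w = frame.shape
    coeffs = np.empty((h, w), dtype=np.float64)
    for y in range(0, h, block):
        for x in range(0, w, block):
            coeffs[y:y+block, x:x+block] = dctn(
                frame[y:y+block, x:x+block].astype(np.float64), norm="ortho")

    # Uniform quantization of each coefficient into `levels` bins.
    step = (coeffs.max() - coeffs.min()) / levels or 1.0
    q = np.clip(((coeffs - coeffs.min()) / step).astype(np.int32), 0, levels - 1)

    # Split quantization indices into bit planes, MSB first.
    nbits = int(np.ceil(np.log2(levels)))
    planes = [(q >> b) & 1 for b in range(nbits - 1, -1, -1)]
    return planes  # each plane would then be channel-coded (e.g. LDPC/turbo)
```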
The Wyner-Ziv decoder 4 decodes the Wyner-Ziv coded frames sent by the Wyner-Ziv encoder 3 and sends the Wyner-Ziv decoded frames to the reconstruction module 12.
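On the decoder side, once the channel decoder has recovered the bit planes, the quantization indices can be reassembled and dequantized. A minimal sketch mirroring the encoder outline above; the channel-decoding stage itself is outside its scope, and the bin layout is assumed to match the encoder's:

```python
import numpy as np

def wz_decode_back_end(planes, c_min, step):
    """Rebuild quantization indices from decoded bit planes and dequantize.

    `planes` lists the bit planes MSB-first, as produced by the encoder
    sketch above; `c_min` and `step` describe the assumed uniform quantizer.
    Returns coefficient estimates at the centre of each quantization bin.
    """
    q = np.zeros_like(planes[0], dtype=np.int32)
    for plane in planes:                 # MSB first
        q = (q << 1) | plane
    return c_min + (q + 0.5) * step      # mid-bin dequantization
```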
The first intra-frame encoder 5 applies H.264 intra-frame coding to the first key frames sent by the base viewpoint 1 and sends the resulting first key coded frames to the first intra-frame decoder 6 over the wireless channel.

The first intra-frame decoder 6 applies H.264 intra-frame decoding to the first key coded frames sent by the first intra-frame encoder 5 and sends the resulting first key decoded frames to the temporal side-information generation module 9.

The temporal side-information generation module 9 applies, in turn, preprocessing, block matching, and bidirectional motion interpolation to two consecutive first key decoded frames from the first intra-frame decoder 6, then sends the generated temporal side-information frame to the fusion module 11.

The second intra-frame encoder 7 applies H.264 intra-frame coding to the second key frames sent by the enhanced viewpoint 2 and sends the resulting second key coded frames to the second intra-frame decoder 8 over the wireless channel.

The second intra-frame decoder 8 applies H.264 intra-frame decoding to the second key coded frames sent by the second intra-frame encoder 7 and sends the resulting second key decoded frames to the spatial side-information generation module 10.
The spatial side-information generation module 10 performs motion estimation on the second key decoded frames sent by the second intra-frame decoder 8 and sends the resulting initial spatial side-information frame to the fusion module 11.

The fusion module 11 maps, via the fundamental matrix derived from the correlation between the base viewpoint 1 and the enhanced viewpoint 2, the initial spatial side information sent by the spatial side-information generation module 10 onto the base viewpoint 1, obtaining the mapped spatial side information; it then fuses, by average interpolation, the temporal side-information frame sent by the temporal side-information generation module 9 with the mapped spatial side-information frame, and sends the resulting fused information frame to the reconstruction module 12.

The reconstruction module 12 filters the fused information frame sent by the fusion module 11 and performs image reconstruction from the Wyner-Ziv decoded frames sent by the Wyner-Ziv decoder 4 and the filtered fused information frame.

In the present invention, image reconstruction from the Wyner-Ziv decoded frames and the filtered fused information frame follows the prior art and is not described further here.
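Although the patent treats the reconstruction step as prior art, a common choice in Wyner-Ziv codecs is to clamp the side information to the quantization bin recovered by the channel decoder. A minimal sketch under that assumption (the bin layout must match the encoder's quantizer):

```python
import numpy as np

def reconstruct(side_info_coeffs, q_indices, q_min, step):
    """Clamp side-information DCT coefficients to the decoded quantization bin.

    Assumes the standard Wyner-Ziv reconstruction rule: if the side
    information falls inside the bin [lo, hi] identified by the decoded
    index, keep it; otherwise clip it to the nearest bin boundary.
    """
    lo = q_min + q_indices * step           # lower edge of each decoded bin
    hi = lo + step                          # upper edge
    return np.clip(side_info_coeffs, lo, hi)
```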
In short, as summarized above, the base viewpoint and the enhanced viewpoint collect video simultaneously like a pair of human eyes deployed in the mine roadway, with their epipolar lines parallel and on the same horizontal scan line; the enhanced viewpoint's images serve as reference images that compensate for otherwise incomplete video information at the monitoring room, while only the base viewpoint's Wyner-Ziv and first key frames and the enhanced viewpoint's second key frames are encoded and decoded, so the framework attains high coding efficiency and decoding quality together with high fault tolerance.
FIG. 2 is a schematic diagram of the structure of the temporal side-information generation module of the present invention. As shown in FIG. 2, the temporal side-information generation module 9 of the present invention comprises: a first preprocessing unit 91, a first block matching unit 92, and a temporal side-information generation unit 93; wherein:

The first preprocessing unit 91 applies low-pass filtering to two consecutive first key decoded frames from the first intra-frame decoder 6, partitions each of the two resulting first key filtered frames into fifty or more basic macroblocks of size M×N, and sends the basic macroblocks to the first block matching unit 92; here M and N denote numbers of pixels and are natural numbers.

The first block matching unit 92 searches among the basic macroblocks sent by the first preprocessing unit 91 for pairs satisfying MSE(i,j) ≤ δ and sends each pair of mutually matched basic macroblocks to the temporal side-information generation unit 93. The matching function is

$$\mathrm{MSE}(i,j)=\frac{1}{M\,N}\sum_{x=1}^{M}\sum_{y=1}^{N}\bigl[f_k(x,y)-f_{k-1}(x+i,\,y+j)\bigr]^{2}$$

where δ is a preset real-valued threshold; (i,j) is the motion vector between two arbitrary basic macroblocks; (x,y) and (x+i,y+j) are pixel coordinates; f_k(x,y) is the pixel value at (x,y) of the current frame among the two consecutive first key decoded frames; and f_{k-1}(x+i,y+j) is the pixel value at (x+i,y+j) of the previous frame.

The temporal side-information generation unit 93 processes each pair of mutually matched basic macroblocks sent by the first block matching unit 92 by bidirectional motion interpolation to obtain the temporal side-information frame

$$Y_{2n}(p)=\tfrac{1}{2}\bigl[X_{2n-1}(p+MV_{f2n})+X_{2n+1}(p+MV_{b2n})\bigr]$$

and sends the temporal side-information frame Y_2n(p) to the fusion module 11. Here Y_2n(p) denotes the temporal side-information frame and p a pixel coordinate within it; X_2n-1 denotes the macroblock of the matched pair belonging to the earlier of the two consecutive first key filtered frames, and X_2n+1 the macroblock belonging to the later of the two; MV_f2n is the forward motion vector and MV_b2n the backward motion vector, both known.
FIG. 3 is a schematic diagram of the structure of the spatial side-information generation module of the present invention. As shown in FIG. 3, the spatial side-information generation module 10 of the present invention comprises: a second preprocessing unit 101, a second block matching unit 102, and a spatial side-information generation unit 103; wherein:

The second preprocessing unit 101 applies low-pass filtering to two consecutive second key decoded frames from the second intra-frame decoder 8, partitions each of the two resulting second key filtered frames into fifty or more enhanced macroblocks of size M×N, and sends the enhanced macroblocks to the second block matching unit 102; here M and N again denote numbers of pixels and are natural numbers.

The second block matching unit 102 searches among the enhanced macroblocks sent by the second preprocessing unit 101 for pairs satisfying MSE(r,s) ≤ γ and sends each pair of mutually matched enhanced macroblocks to the spatial side-information generation unit 103. The matching function is

$$\mathrm{MSE}(r,s)=\frac{1}{M\,N}\sum_{x=1}^{M}\sum_{y=1}^{N}\bigl[f_l(x,y)-f_{l-1}(x+r,\,y+s)\bigr]^{2}$$

where γ is a preset real-valued threshold; (r,s) is the motion vector between two arbitrary enhanced macroblocks; (x,y) and (x+r,y+s) are pixel coordinates; f_l(x,y) is the pixel value at (x,y) of the current frame among the two consecutive second key decoded frames; and f_{l-1}(x+r,y+s) is the pixel value at (x+r,y+s) of the previous frame.

The spatial side-information generation unit 103 processes each pair of mutually matched enhanced macroblocks sent by the second block matching unit 102 by bidirectional motion interpolation to obtain the initial spatial side-information frame

$$V_{2m}(q)=\tfrac{1}{2}\bigl[U_{2m-1}(q+MV_{f2m})+U_{2m+1}(q+MV_{b2m})\bigr]$$

and sends the initial spatial side-information frame V_2m to the fusion module 11. Here V_2m(q) denotes the initial spatial side-information frame and q a pixel coordinate within it; U_2m-1 denotes the enhanced macroblock of the matched pair belonging to the earlier of the two consecutive second key filtered frames, and U_2m+1 the enhanced macroblock belonging to the later of the two; MV_f2m is the forward motion vector and MV_b2m the backward motion vector, both known.
FIG. 4 is a schematic diagram of the structure of the fusion module of the present invention. As shown in FIG. 4, the fusion module 11 of the present invention comprises: a third preprocessing unit 111, a feature point extraction unit 112, a fundamental matrix generation unit 113, a mapping unit 114, and an information fusion unit 115; wherein:

The third preprocessing unit 111 filters the temporal side-information frame sent by the temporal side-information generation module 9 and the initial spatial side-information frame sent by the spatial side-information generation module 10, and sends the resulting temporal side-information filtered frame and initial spatial side-information filtered frame to the feature point extraction unit 112; at the same time, it sends the temporal side-information filtered frame to the information fusion unit 115 and the initial spatial side-information filtered frame to the mapping unit 114.

The feature point extraction unit 112 computes, in the horizontal and vertical directions, the gradients of the luminance I(x,y) of each pixel of the temporal side-information filtered frame and of the luminance I'(x,y) of each pixel of the initial spatial side-information filtered frame sent by the third preprocessing unit 111:

$$I_x = I \otimes (-1,\ 0,\ 1), \qquad I_y = I \otimes (-1,\ 0,\ 1)^{T}$$

$$I'_x = I' \otimes (-1,\ 0,\ 1), \qquad I'_y = I' \otimes (-1,\ 0,\ 1)^{T}$$

where ⊗ denotes convolution. It then builds from these gradients the basic autocorrelation matrix M and the enhanced autocorrelation matrix M':

$$M=\begin{bmatrix} I_x^{2} & I_xI_y \\ I_xI_y & I_y^{2} \end{bmatrix}, \qquad M'=\begin{bmatrix} I_x'^{2} & I_x'I_y' \\ I_x'I_y' & I_y'^{2} \end{bmatrix}$$

Both matrices are smoothed with the Gaussian window

$$w(x,y)=\exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$

where σ² is the pixel variance, giving the basic smoothed autocorrelation matrix $\bar M = w \otimes M$ and the enhanced smoothed autocorrelation matrix $\bar M' = w \otimes M'$. From the basic autocorrelation matrix M the unit extracts two feature points λ₁, λ₂ representing its principal curvatures, and from the enhanced autocorrelation matrix M' two feature points λ₁', λ₂' representing its principal curvatures, and sends these feature points and their pixel coordinates to the fundamental matrix generation unit 113. The feature points satisfy the constraints λ₁·λ₂ − 0.04·(λ₁+λ₂)² > δ and λ₁'·λ₂' − 0.04·(λ₁'+λ₂')² > δ, where δ is a set threshold.
The fundamental matrix generation unit 113 is configured to obtain the correlation coefficient CC between the base viewpoint 1 and the enhanced viewpoint 2 according to the feature points sent by the feature point extraction unit 112 and the pixel coordinates corresponding to each feature point:

$$CC=\frac{\displaystyle\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}(x_{1}+u,y_{1}+v)-\bar I_{1}\bigr]\bigl[I_{1}{}'(x_{1}{}'+u,y_{1}{}'+v)-\bar I_{1}{}'\bigr]}{\sqrt{\displaystyle\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}(x_{1}+u,y_{1}+v)-\bar I_{1}\bigr]^{2}\;\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}{}'(x_{1}{}'+u,y_{1}{}'+v)-\bar I_{1}{}'\bigr]^{2}}}$$
where (x1, y1), (x2, y2) denote the pixel coordinates of the feature points λ1, λ2, and I1(x1, y1), I2(x2, y2) denote the gray values of the feature points λ1, λ2; (x1', y1'), (x2', y2') denote the pixel coordinates of the feature points λ1', λ2', and I1'(x1', y1'), I2'(x2', y2') denote the gray values of the feature points λ1', λ2';
Within matching windows of size (2m+1)×(2m+1) centered at (x1, y1), (x2, y2), (x1', y1'), (x2', y2'), respectively, 6 groups of pre-matching points are extracted as 6 groups of samples, and the following linear equation system is constructed:

$$\begin{cases}\mathbf h_{1}^{T}(a,\,b,\,1)^{T}-a'\,\mathbf h_{3}^{T}(a,\,b,\,1)^{T}=0\\[2pt]\mathbf h_{2}^{T}(a,\,b,\,1)^{T}-b'\,\mathbf h_{3}^{T}(a,\,b,\,1)^{T}=0\end{cases}$$

where m is a natural number; (a, b) and (a', b') denote a pixel point in the image captured by the base viewpoint and the corresponding pixel point in the image captured by the enhanced viewpoint, respectively; and h1, h2, h3 denote three vectors.
h1, h2, h3 are obtained from 4 groups of samples randomly drawn from the 6 groups, yielding the homography matrix H = [h1 h2 h3]ᵀ; for the remaining 2 groups of samples, the epipole e' is obtained from the constraint x'ᵀ[e']ₓHx = 0; the resulting fundamental matrix F = [e']ₓH is then sent to the mapping unit 114.
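A sketch of this estimation step is given below, assuming the DLT-style linear system reconstructed above; the least-squares SVD solves are illustrative rather than the framework's exact numerical procedure, and [e']ₓ denotes the cross-product matrix of the epipole.

```python
import numpy as np

def estimate_H(pts, pts_p):
    """Estimate a homography H = [h1 h2 h3]^T from 4 point pairs.

    Each pair (a, b) <-> (a', b') contributes the two linear equations
    reconstructed above; the 9 entries of H are taken as the null
    vector of the stacked system (via SVD)."""
    A = []
    for (a, b), (ap, bp) in zip(pts, pts_p):
        x = [a, b, 1.0]
        A.append(x + [0.0, 0.0, 0.0] + [-ap * c for c in x])
        A.append([0.0, 0.0, 0.0] + x + [-bp * c for c in x])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 3)

def fundamental_from_H(H, pts, pts_p):
    """Solve x'^T [e']x H x = 0 for the epipole e' using the remaining
    point pairs, then return F = [e']x H."""
    A = []
    for (a, b), (ap, bp) in zip(pts, pts_p):
        Hx = H @ np.array([a, b, 1.0])
        xp = np.array([ap, bp, 1.0])
        # x'.(e' x Hx) = e'.(Hx x x'), so each pair gives one equation
        A.append(np.cross(Hx, xp))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    e = Vt[-1]
    ex = np.array([[0.0, -e[2], e[1]],
                   [e[2], 0.0, -e[0]],
                   [-e[1], e[0], 0.0]])   # [e']x cross-product matrix
    return ex @ H
```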
The mapping unit 114 maps the initial spatial side information filtered frame to the base viewpoint 1 through the fundamental matrix F sent by the fundamental matrix generation unit 113, and sends the resulting mapped spatial side information frame to the information fusion unit 115.
The information fusion unit 115 is configured to fuse, by average interpolation, the temporal side information frame sent by the third pre-processing unit 111 with the mapped spatial side information frame sent by the mapping unit 114, and to send the resulting fused information frame to the reconstruction module 12.
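As a sketch, average-interpolation fusion can be as simple as a per-pixel mean of the two side-information frames; the equal weighting is an assumption consistent with the term "average interpolation".

```python
import numpy as np

def fuse_side_information(temporal: np.ndarray, mapped_spatial: np.ndarray) -> np.ndarray:
    """Average-interpolation fusion of the temporal side information
    frame and the mapped spatial side information frame (assumed
    equal weights)."""
    return 0.5 * (temporal.astype(np.float64) + mapped_spatial.astype(np.float64))
```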
In conclusion, the above are only preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. A distributed video coding framework, characterized in that the coding framework comprises a base viewpoint, an enhanced viewpoint, a Wyner-Ziv encoder, a Wyner-Ziv decoder, a first intra-frame encoder, a first intra-frame decoder, a temporal side information generation module, a second intra-frame encoder, a second intra-frame decoder, a spatial side information generation module, a fusion module, and a reconstruction module; wherein,
the base viewpoint is configured to capture a first environment video image, divide the first environment video image into first Wyner-Ziv frames and first key frames according to the sequence numbers of the first environment video image, and send the first Wyner-Ziv frames and the first key frames to the Wyner-Ziv encoder and the first intra-frame encoder, respectively;
the enhanced viewpoint is configured to capture a second environment video image, divide the second environment video image into second Wyner-Ziv frames and second key frames according to the sequence numbers of the second environment video image, and send the second key frames to the second intra-frame encoder;
the Wyner-Ziv encoder is configured to perform, on the first Wyner-Ziv frames sent by the base viewpoint, a discrete cosine transform that removes inter-pixel correlation, channel-encode the bit planes formed by quantizing the transform coefficients, and send the resulting Wyner-Ziv encoded frames to the Wyner-Ziv decoder over a wireless channel;
the Wyner-Ziv decoder is configured to decode the Wyner-Ziv encoded frames sent by the Wyner-Ziv encoder and send the Wyner-Ziv decoded frames to the reconstruction module;
the first intra-frame encoder is configured to perform H.264 intra-frame encoding on the first key frames sent by the base viewpoint and send the resulting first key encoded frames to the first intra-frame decoder over a wireless channel;
the first intra-frame decoder is configured to perform H.264 intra-frame decoding on the first key encoded frames sent by the first intra-frame encoder and send the resulting first key decoded frames to the temporal side information generation module;
the temporal side information generation module is configured to sequentially perform pre-processing, block matching, and bidirectional motion interpolation on two consecutive first key decoded frames from the first intra-frame decoder, and send the generated temporal side information frame to the fusion module;
the second intra-frame encoder is configured to perform H.264 intra-frame encoding on the second key frames sent by the enhanced viewpoint and send the resulting second key encoded frames to the second intra-frame decoder over a wireless channel;
the second intra-frame decoder is configured to perform H.264 intra-frame decoding on the second key encoded frames sent by the second intra-frame encoder and send the resulting second key decoded frames to the spatial side information generation module;
the spatial side information generation module is configured to perform motion estimation according to the second key decoded frames sent by the second intra-frame decoder and send the resulting initial spatial side information frame to the fusion module;
the fusion module is configured to map, according to the correlation between the base viewpoint and the enhanced viewpoint, the initial spatial side information frame sent by the spatial side information generation module to the base viewpoint through a fundamental matrix to obtain a mapped spatial side information frame, fuse the temporal side information frame sent by the temporal side information generation module with the mapped spatial side information frame by average interpolation, and send the resulting fused information frame to the reconstruction module;
the reconstruction module is configured to filter the fused information frame sent by the fusion module and perform image reconstruction according to the Wyner-Ziv decoded frames sent by the Wyner-Ziv decoder and the filtered fused information frame.
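To make the encoder side of claim 1 concrete, below is a minimal sketch of the Wyner-Ziv path up to bit-plane extraction. The 4×4 block size, the uniform quantizer and its step, and the number of levels are all assumptions, and the channel-coding stage (e.g. LDPC or turbo codes) is omitted.

```python
import numpy as np

def dct_matrix(n: int = 4) -> np.ndarray:
    """Orthonormal DCT-II basis of size n x n."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C

def wyner_ziv_bitplanes(frame: np.ndarray, n: int = 4, levels: int = 16):
    """Blockwise DCT -> uniform quantization -> bit planes.

    Returns the bit planes (most significant first); in the framework
    these would then be channel-encoded and sent over the wireless
    channel. Block size and quantizer are hypothetical choices."""
    h, w = frame.shape
    C = dct_matrix(n)
    q = np.zeros((h // n, w // n, n, n), dtype=np.int64)
    step = 255.0 * n / levels            # assumed uniform step
    for i in range(0, h - h % n, n):
        for j in range(0, w - w % n, n):
            blk = C @ frame[i:i + n, j:j + n].astype(np.float64) @ C.T
            q[i // n, j // n] = np.clip(np.round(blk / step), -levels, levels - 1)
    bits = int(np.ceil(np.log2(levels))) + 1
    u = (q + levels).astype(np.uint64)   # shift indices to non-negative
    return [(u >> b) & 1 for b in reversed(range(bits))]
```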
2. The distributed video coding framework according to claim 1, characterized in that the temporal side information generation module comprises: a first pre-processing unit, a first block matching unit, and a temporal side information generation unit; wherein,
the first pre-processing unit is configured to perform low-pass filtering on two consecutive first key decoded frames from the first intra-frame decoder, divide each of the resulting two consecutive first key filtered frames into more than fifty basic macroblocks of size M×N, and send each basic macroblock to the first block matching unit, where M and N each denote a number of pixels and are natural numbers;
the first block matching unit is configured to search, among the basic macroblocks sent by the first pre-processing unit, according to MSE(i, j) ≤ δ, and send the two matched basic macroblocks found to the temporal side information generation unit; wherein the matching function is

$$MSE(i,j)=\frac{1}{M\cdot N}\sum_{x=1}^{M}\sum_{y=1}^{N}\bigl[f_{k}(x,y)-f_{k-1}(x+i,\,y+j)\bigr]^{2}$$

δ is a set value and is a real number; (i, j) denotes the motion vector between two arbitrary basic macroblocks, and (x, y), (x+i, y+j) denote pixel coordinates; f_k(x, y) denotes the pixel value at (x, y) of the current frame of the two consecutive first key decoded frames; f_{k−1}(x+i, y+j) denotes the pixel value at (x+i, y+j) of the previous frame of the two consecutive first key decoded frames;
the temporal side information generation unit is configured to process the two matched basic macroblocks sent by the first block matching unit by bidirectional motion interpolation to obtain the temporal side information frame

$$Y_{2n}(p)=\frac{1}{2}\bigl[X_{2n-1}\bigl(p+MV_{f2n}\bigr)+X_{2n+1}\bigl(p+MV_{b2n}\bigr)\bigr]$$

and to send the temporal side information frame Y_{2n}(p) to the fusion module (11); wherein Y_{2n}(p) denotes the temporal side information frame and p denotes a pixel coordinate in the temporal side information frame; X_{2n−1} denotes the one of the two matched basic macroblocks belonging to the earlier of the two consecutive first key filtered frames, and X_{2n+1} denotes the one belonging to the later of the two consecutive first key filtered frames; MV_{f2n} denotes the forward motion vector, MV_{b2n} denotes the backward motion vector, and both MV_{f2n} and MV_{b2n} are known.
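A compact sketch of claim 2's block matching and interpolation follows. The exhaustive ±search window and the threshold value are assumptions (the claim only requires MSE(i, j) ≤ δ), and the ½(X_{2n−1}+X_{2n+1}) form follows the interpolation formula reconstructed above.

```python
import numpy as np

def mse(cur_blk: np.ndarray, ref: np.ndarray, x: int, y: int, i: int, j: int) -> float:
    """MSE(i, j) between the current M x N block at (x, y) and the
    reference-frame block displaced by the motion vector (i, j)."""
    M, N = cur_blk.shape
    ref_blk = ref[x + i:x + i + M, y + j:y + j + N].astype(np.float64)
    return float(np.mean((cur_blk.astype(np.float64) - ref_blk) ** 2))

def match_block(cur: np.ndarray, ref: np.ndarray, x: int, y: int,
                M: int, N: int, search: int = 8, delta: float = 50.0):
    """Search for (i, j) with MSE(i, j) <= delta over a +/-search window
    (an assumed range) and return the best motion vector, or None."""
    blk = cur[x:x + M, y:y + N]
    best, best_mv = np.inf, None
    for i in range(-search, search + 1):
        for j in range(-search, search + 1):
            if 0 <= x + i <= ref.shape[0] - M and 0 <= y + j <= ref.shape[1] - N:
                e = mse(blk, ref, x, y, i, j)
                if e < best:
                    best, best_mv = e, (i, j)
    return best_mv if best <= delta else None

def interpolate(prev_blk: np.ndarray, next_blk: np.ndarray) -> np.ndarray:
    """Bidirectional motion interpolation of two matched macroblocks:
    Y = (X_prev + X_next) / 2 after motion compensation."""
    return 0.5 * (prev_blk.astype(np.float64) + next_blk.astype(np.float64))
```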
3. The distributed video coding framework according to claim 1, characterized in that the spatial side information generation module comprises: a second pre-processing unit, a second block matching unit, and a spatial side information generation unit; wherein,
the second pre-processing unit is configured to perform low-pass filtering on two consecutive second key decoded frames from the second intra-frame decoder, divide each of the resulting two consecutive second key filtered frames into more than fifty enhanced macroblocks of size M×N, and send each enhanced macroblock to the second block matching unit, where M and N each denote a number of pixels and are natural numbers;
the second block matching unit is configured to search, among the enhanced macroblocks sent by the second pre-processing unit, according to MSE(r, s) ≤ γ, and send the two matched enhanced macroblocks found to the spatial side information generation unit; wherein the matching function is

$$MSE(r,s)=\frac{1}{M\cdot N}\sum_{x=1}^{M}\sum_{y=1}^{N}\bigl[f_{l}(x,y)-f_{l-1}(x+r,\,y+s)\bigr]^{2}$$

γ is a set value and is a real number; (r, s) denotes the motion vector between two arbitrary enhanced macroblocks, and (x, y), (x+r, y+s) denote pixel coordinates; f_l(x, y) denotes the pixel value at (x, y) of the current frame of the two consecutive second key decoded frames; f_{l−1}(x+r, y+s) denotes the pixel value at (x+r, y+s) of the previous frame of the two consecutive second key decoded frames;
the spatial side information generation unit is configured to process the two matched enhanced macroblocks sent by the second block matching unit by bidirectional motion interpolation to obtain the initial spatial side information frame

$$V_{2m}(q)=\frac{1}{2}\bigl[U_{2m-1}\bigl(q+MV_{f2m}\bigr)+U_{2m+1}\bigl(q+MV_{b2m}\bigr)\bigr]$$

and to send the initial spatial side information frame V_{2m}(q) to the fusion module; wherein V_{2m}(q) denotes the initial spatial side information frame and q denotes a pixel coordinate in the initial spatial side information frame; U_{2m−1} denotes the one of the two matched macroblocks belonging to the earlier of the two consecutive second key filtered frames, and U_{2m+1} denotes the one belonging to the later of the two consecutive second key filtered frames; MV_{f2m} denotes the forward motion vector, MV_{b2m} denotes the backward motion vector, and both MV_{f2m} and MV_{b2m} are known.
4. The distributed video coding framework according to claim 1, characterized in that the fusion module comprises a third pre-processing unit, a feature point extraction unit, a fundamental matrix generation unit, a mapping unit, and an information fusion unit; wherein,
the third pre-processing unit is configured to filter the temporal side information frame sent by the temporal side information generation module and the initial spatial side information frame sent by the spatial side information generation module, send the resulting temporal side information filtered frame and initial spatial side information filtered frame to the feature point extraction unit, and at the same time send the temporal side information filtered frame and the initial spatial side information filtered frame to the information fusion unit and the mapping unit, respectively;
the feature point extraction unit is configured to obtain, in the horizontal and vertical directions, the gradients of the luminance I(x, y) and I'(x, y) of each pixel of the temporal side information filtered frame and the initial spatial side information filtered frame sent by the third pre-processing unit, respectively:

$$X = I \otimes (-1,\,0,\,1), \qquad Y = I \otimes (-1,\,0,\,1)^{T}$$

$$X' = I' \otimes (-1,\,0,\,1), \qquad Y' = I' \otimes (-1,\,0,\,1)^{T}$$

where ⊗ denotes convolution;
then, from these gradients, the basic autocorrelation matrix M and the enhanced autocorrelation matrix M' are constructed correspondingly:

$$M=\begin{bmatrix}X^{2} & XY\\ XY & Y^{2}\end{bmatrix},\qquad M'=\begin{bmatrix}X'^{2} & X'Y'\\ X'Y' & Y'^{2}\end{bmatrix};$$
the basic autocorrelation matrix M and the enhanced autocorrelation matrix M' are smoothed to obtain the corresponding basic smoothed autocorrelation matrix $\tilde M = w \otimes M$ and enhanced smoothed autocorrelation matrix $\tilde M' = w \otimes M'$; for the basic autocorrelation matrix M, two feature points λ1, λ2 representing the principal curvatures of M are extracted, and for the enhanced autocorrelation matrix M', two feature points λ1', λ2' representing the principal curvatures of M' are extracted; each of these feature points, together with its pixel coordinates, is sent to the fundamental matrix generation unit; wherein

$$w(x,y)=\exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$

σ² denotes the pixel variance; the feature points satisfy the constraints λ1·λ2 − 0.04·(λ1+λ2)² > δ and λ1'·λ2' − 0.04·(λ1'+λ2')² > δ, where δ is a set threshold;
the fundamental matrix generation unit is configured to obtain the correlation coefficient CC between the base viewpoint and the enhanced viewpoint according to the feature points sent by the feature point extraction unit and the pixel coordinates corresponding to each feature point:

$$CC=\frac{\displaystyle\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}(x_{1}+u,y_{1}+v)-\bar I_{1}\bigr]\bigl[I_{1}{}'(x_{1}{}'+u,y_{1}{}'+v)-\bar I_{1}{}'\bigr]}{\sqrt{\displaystyle\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}(x_{1}+u,y_{1}+v)-\bar I_{1}\bigr]^{2}\;\sum_{u=-m}^{m}\sum_{v=-m}^{m}\bigl[I_{1}{}'(x_{1}{}'+u,y_{1}{}'+v)-\bar I_{1}{}'\bigr]^{2}}}$$
where (x1, y1), (x2, y2) denote the pixel coordinates of the feature points λ1, λ2, and I1(x1, y1), I2(x2, y2) denote the gray values of the feature points λ1, λ2; (x1', y1'), (x2', y2') denote the pixel coordinates of the feature points λ1', λ2', and I1'(x1', y1'), I2'(x2', y2') denote the gray values of the feature points λ1', λ2';
within matching windows of size (2m+1)×(2m+1) centered at (x1, y1), (x2, y2), (x1', y1'), (x2', y2'), respectively, 6 groups of pre-matching points are extracted as 6 groups of samples, and the following linear equation system is constructed:

$$\begin{cases}\mathbf h_{1}^{T}(a,\,b,\,1)^{T}-a'\,\mathbf h_{3}^{T}(a,\,b,\,1)^{T}=0\\[2pt]\mathbf h_{2}^{T}(a,\,b,\,1)^{T}-b'\,\mathbf h_{3}^{T}(a,\,b,\,1)^{T}=0\end{cases}$$

where m is a natural number; (a, b) and (a', b') denote a pixel point in the image captured by the base viewpoint and the corresponding pixel point in the image captured by the enhanced viewpoint, respectively; and h1, h2, h3 denote three vectors;
h1, h2, h3 are obtained from 4 groups of samples randomly drawn from the 6 groups, yielding the homography matrix H = [h1 h2 h3]ᵀ; for the remaining 2 groups of samples, the epipole e' is obtained from the constraint x'ᵀ[e']ₓHx = 0; the resulting fundamental matrix F = [e']ₓH is then sent to the mapping unit;
the mapping unit maps the initial spatial side information filtered frame to the base viewpoint through the fundamental matrix F sent by the fundamental matrix generation unit, and sends the resulting mapped spatial side information frame to the information fusion unit;
the information fusion unit is configured to fuse, by average interpolation, the temporal side information frame sent by the third pre-processing unit with the mapped spatial side information frame sent by the mapping unit, and to send the resulting fused information frame to the reconstruction module.
PCT/CN2015/097220 2015-12-04 2015-12-12 Distributed video encoding framework WO2017092072A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510883301.8A CN105430406B (en) 2015-12-04 2015-12-04 A kind of distributed video coding frame
CN2015108833018 2015-12-04

Publications (1)

Publication Number Publication Date
WO2017092072A1 true WO2017092072A1 (en) 2017-06-08

Family

ID=55508294

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097220 WO2017092072A1 (en) 2015-12-04 2015-12-12 Distributed video encoding framework

Country Status (2)

Country Link
CN (1) CN105430406B (en)
WO (1) WO2017092072A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110392258B (en) * 2019-07-09 2021-03-16 武汉大学 Distributed multi-view video compression sampling reconstruction method combining space-time side information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070121722A1 (en) * 2005-11-30 2007-05-31 Emin Martinian Method and system for randomly accessing multiview videos with known prediction dependency
CN104093030A (en) * 2014-07-09 2014-10-08 天津大学 Distributed video coding side information generating method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI279143B (en) * 2005-07-11 2007-04-11 Softfoundry Internat Ptd Ltd Integrated compensation method of video code flow
US8599929B2 (en) * 2009-01-09 2013-12-03 Sungkyunkwan University Foundation For Corporate Collaboration Distributed video decoder and distributed video decoding method
CN102611893B (en) * 2012-03-09 2014-02-19 北京邮电大学 DMVC (distributed multi-view video coding) side-information integration method on basis of histogram matching and SAD (security association database) judgment
CN103002283A (en) * 2012-11-20 2013-03-27 南京邮电大学 Multi-view distributed video compression side information generation method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070121722A1 (en) * 2005-11-30 2007-05-31 Emin Martinian Method and system for randomly accessing multiview videos with known prediction dependency
CN104093030A (en) * 2014-07-09 2014-10-08 天津大学 Distributed video coding side information generating method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DING, JINQING: "The Research on Side Information Generation Technology for Multi-View Distributed Video Coding", MASTER'S DISSERTATION OF NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, 15 May 2015 (2015-05-15) *
LI, JIE: "The Research on Side Information Generation for Distributed Multi-View Video Coding", MASTER'S DISSERTATION OF NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, 15 June 2013 (2013-06-15) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111479114A (en) * 2019-01-23 2020-07-31 华为技术有限公司 Point cloud encoding and decoding method and device
CN111479114B (en) * 2019-01-23 2022-07-22 华为技术有限公司 Point cloud encoding and decoding method and device
CN115002482A (en) * 2022-04-27 2022-09-02 电子科技大学 End-to-end video compression method and system using structural preservation motion estimation
CN115002482B (en) * 2022-04-27 2024-04-16 电子科技大学 End-to-end video compression method and system using structural preserving motion estimation
CN115767108A (en) * 2022-10-20 2023-03-07 哈尔滨工业大学(深圳) Distributed image compression method and system based on feature domain matching
CN115767108B (en) * 2022-10-20 2023-11-07 哈尔滨工业大学(深圳) Distributed image compression method and system based on feature domain matching

Also Published As

Publication number Publication date
CN105430406A (en) 2016-03-23
CN105430406B (en) 2018-06-12

Similar Documents

Publication Publication Date Title
WO2017092072A1 (en) Distributed video encoding framework
CN107027025B (en) A kind of light field image compression method based on macro block of pixels adaptive prediction
CN102611893B (en) DMVC (distributed multi-view video coding) side-information integration method on basis of histogram matching and SAD (sum of absolute differences) judgment
KR19990074806A (en) Texture Padding Apparatus and its Padding Method for Motion Estimation in Parallel Coding
CN104363460A (en) Three-dimensional image coding method based on three-dimensional self-organized mapping
CN105357523A (en) High-order singular value decomposition (HOSVD) algorithm based video compression system and method
Fang et al. 3dac: Learning attribute compression for point clouds
Kaaniche et al. Vector lifting schemes for stereo image coding
Wang et al. Fast depth video compression for mobile RGB-D sensors
JP3955910B2 (en) Image signal processing method
Tran et al. Bi-directional intra prediction based measurement coding for compressive sensing images
JP3955909B2 (en) Image signal processing apparatus and method
Cossalter et al. Privacy-enabled object tracking in video sequences using compressive sensing
Yoo et al. Enhanced compression of integral images by combined use of residual images and MPEG-4 algorithm in three-dimensional integral imaging
Angayarkanni et al. Distributed compressive video coding using enhanced side information for WSN
Peng et al. An optimized algorithm based on generalized difference expansion method used for HEVC reversible video information hiding
CN107509074A (en) Adaptive 3 D video coding-decoding method based on compressed sensing
Rizkallah et al. Graph-based spatio-angular prediction for quasi-lossless compression of light fields
Deng et al. MASIC: Deep Mask Stereo Image Compression
Liu et al. Disparity-compensated total-variation minimization for compressed-sensed multiview image reconstruction
KR102127212B1 (en) Method and apparatus for decoding multi-view video information
El Kerek et al. A new technique to multiplex stereo images: LSB watermarking and Hamming code
CN104427323A (en) Depth-based three-dimensional image processing method
Zhu et al. Efficient shape coding for object-based 3D video applications
Ouddane et al. Stereo image coding: State of the art

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15909586

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15909586

Country of ref document: EP

Kind code of ref document: A1