WO2014079303A1 - Method, device and system for synthesizing multi-screen video - Google Patents

Info

Publication number
WO2014079303A1
WO2014079303A1 PCT/CN2013/086014 CN2013086014W WO2014079303A1 WO 2014079303 A1 WO2014079303 A1 WO 2014079303A1 CN 2013086014 W CN2013086014 W CN 2013086014W WO 2014079303 A1 WO2014079303 A1 WO 2014079303A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
channel
videos
module
decoding module
Prior art date
Application number
PCT/CN2013/086014
Other languages
French (fr)
Chinese (zh)
Inventor
贾少华
桂志渊
刘克华
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2014079303A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/14 Display of multiple viewports
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2340/00 Aspects of display data processing
    • G09G2340/04 Changes in size, position or resolution of an image
    • G09G2340/045 Zooming at least part of an image, i.e. enlarging it or shrinking it
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • Video multi-picture synthesis method device and system
  • The present invention relates to video conferencing technology, and in particular to a video multi-picture synthesis method, apparatus and system.
  • Background
  • An HD conference TV terminal usually adopts the hardware architecture shown in FIG. 1.
  • The working principle of the HD conference TV terminal is as follows:
  • The network communication module 110 receives the network packets transmitted by the remote conference TV terminal and sends them to the main control processor 109 for unpacking, which yields the far-end compressed video stream. The compressed video data is then passed to the decoding module 105 over the system bus 108. After decompressing the video data, the decoding module 105 obtains data in the original RAW format, which its video interface (Video Port, VP) 106 packages into standard BT.1120 format video data and sends to the video processing field programmable gate array (FPGA) 107.
  • At the same time, the local video is sent to the video switching matrix 103 through the video input interface module 101, and the switching matrix 103 also sends this video data to the video processing FPGA 107 according to the system configuration.
  • The video processing FPGA 107 performs video scaling and multi-picture synthesis on the far-end and local video according to the system configuration, and the result is output for display from the video output interface module 102 through the video switching matrix 103.
  • The encoding module 104 obtains the locally input video from the video processing FPGA 107 and compresses and encodes the original image to reduce the image bit rate. The compressed code stream is then passed over the system bus 108 to the main control processor 109 for network packaging and transmitted to the far end through the network communication module 110. This completes the point-to-point communication flow between two conference TV terminals.
  • At present, data is transferred between the encoding and decoding modules and the video processing FPGA over a parallel VP interface.
  • The VP interface is a 16-bit data bus with very limited bandwidth: it can carry at most one channel of 1080P60 video data.
  • As HD conference TV terminals gain built-in multipoint control unit (MCU) functionality, the amount of data to be transferred between the encoding and decoding modules and the video processing FPGA increases greatly, and the parallel VP interface can no longer meet the data transmission needs.
  • When multiple channels of high-resolution, high-frame-rate decoded video need to be transmitted, the decoding module must first scale them down to reduce the data stream bandwidth before sending them to the video processing FPGA through the VP interface.
  • The video processing FPGA then has to scale the video a second time and extract pictures before performing multi-picture synthesis, which increases system complexity, wastes system resources and reduces image quality.
  • In addition, the parallel VP interface occupies a lot of printed circuit board (PCB) wiring space, and when the video clock frequency is high, especially for 1080P60 video, the bus timing is difficult to control.
  • the main objective of the embodiments of the present invention is to provide a video multi-picture synthesis method, apparatus and system, which can save system resources and improve data transmission speed and image quality.
  • An embodiment of the present invention provides a video multi-picture synthesis method, where the method includes: a video processing FPGA receives, through a high-speed serial bus, multiple channels of video and their corresponding addresses sent by a decoding module, the address of each video channel being determined by the decoding module according to the requirements of the multi-picture layout;
  • the received multi-channel video is scaled, so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
  • the scaled videos are cached, and the address corresponding to each cached video is corrected;
  • each video is stored in the corresponding memory space according to its corrected address.
  • Preferably, before the received multi-channel video is scaled, the method further includes: the data sent by the decoding module through the high-speed serial bus is deserialized, the valid data is parsed out, and the valid data is converted to parallel form to obtain parallel data.
  • Preferably, scaling the received multi-channel video is as follows:
  • a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm is selected according to the image quality requirement, and the received multi-channel video is scaled with it.
  • Preferably, before each video is stored in the corresponding memory space according to its corrected address, the method further includes:
  • the videos to be stored in the memory space are selected in turn from the cached multi-channel video by a round-robin mechanism;
  • correspondingly, storing each video in the corresponding memory space is as follows:
  • the selected videos are stored in turn in the corresponding memory space.
  • An embodiment of the present invention provides a video processing FPGA, where the video processing FPGA includes: a high-speed serial bus controller configured to receive, through a high-speed serial bus, multiple channels of video and their corresponding addresses sent by the decoding module, the address of each video channel being determined by the decoding module according to the requirements of the multi-picture layout;
  • the scaling module is configured to scale the multi-channel video received by the high-speed serial bus controller, so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
  • the frame buffer module is configured to cache the scaled videos and to correct the address corresponding to each cached video;
  • the memory controller is configured to store each scaled video in the corresponding memory space according to the address corrected by the frame buffer module.
  • The high-speed serial bus controller is further configured to deserialize the data sent by the decoding module through the high-speed serial bus, parse out the valid data, and convert the valid data to parallel form to obtain parallel data.
  • The scaling module is specifically configured to select a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and to scale the received multi-channel video with it.
  • the video processing FPGA further includes: an arbitration module;
  • the arbitration module is configured to sequentially select a video to be stored in the memory space from the multi-channel video buffered by the frame buffer module by using a round-robin mechanism;
  • the memory controller is configured to sequentially store the video selected by the arbitration module into a corresponding memory space.
  • the frame buffer module is composed of a one-hot state machine, and each state corresponds to one frame of data.
  • An embodiment of the present invention provides a video multi-picture synthesis system, where the system includes a decoding module and a video processing FPGA, where
  • the decoding module is configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each of the decoded video channels, and to send the decoded multi-channel video and the determined corresponding addresses to the video processing FPGA through the high-speed serial bus;
  • the video processing FPGA is configured to receive, through the high-speed serial bus, the multi-channel video and corresponding addresses sent by the decoding module, the address of each video channel being determined by the decoding module according to the requirements of the multi-picture layout;
  • the received multi-channel video is scaled, so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
  • the scaled videos are cached, and the address corresponding to each cached video is corrected;
  • each video is stored in the corresponding memory space according to its corrected address.
  • In the technical solution of the embodiments of the present invention, a video processing field programmable gate array (FPGA) receives, through a high-speed serial bus, multiple channels of video and their corresponding addresses sent by the decoding module, the address of each channel being determined by the decoding module according to the requirements of the multi-picture layout; the received video channels are scaled so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture; the scaled videos are cached and the address corresponding to each cached video is corrected; and each video is stored in the corresponding memory space according to its corrected address. Because the data is transferred over the high-speed serial bus, system resources are saved and data transmission speed and image quality are improved.
  • FIG. 1 is a schematic diagram of a hardware architecture of an existing HD conference television terminal
  • FIG. 2 is a schematic flowchart of an implementation of a video multi-screen synthesis method according to a first embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of an embodiment of a video processing FPGA according to the present invention
  • FIG. 4 is a schematic structural diagram of an embodiment of a decoding module according to the present invention.
  • FIG. 5 is a schematic structural diagram of an embodiment of a video multi-picture synthesizing system according to the present invention.
  • FIG. 6 is a schematic diagram of an implementation flow of a second embodiment of a video multi-picture synthesis method according to the present invention
  • FIG. 7 is a schematic diagram of a three-way sub-picture synthesis structure according to an embodiment of the present invention.
  • a first embodiment of a video multi-picture synthesis method is provided by the present invention. As shown in FIG. 2, the method includes:
  • Step 201 The video processing FPGA receives the multi-channel video and the corresponding address sent by the decoding module through the high-speed serial bus, and the address of each video is determined by the decoding module according to the requirement of the multi-screen layout;
  • Step 202 The received multi-channel video is scaled, and the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-screen;
  • Step 203 Cache the scaled videos, and correct the address corresponding to each cached video;
  • Step 204 Store each video in the corresponding memory space according to its corrected address.
  • Preferably, before the received multi-channel video is scaled, the method further includes: the data sent by the decoding module through the high-speed serial bus is deserialized, the valid data is parsed out, and the valid data is converted to parallel form to obtain parallel data.
  • Preferably, scaling the received multi-channel video is as follows:
  • a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm is selected according to the image quality requirement, and the received multi-channel video is scaled with it.
  • Preferably, before each video is stored in the corresponding memory space according to its corrected address, the method further includes:
  • the videos to be stored in the memory space are selected in turn from the cached multi-channel video by a round-robin mechanism;
  • correspondingly, storing each video in the corresponding memory space is as follows:
  • the selected videos are stored in turn in the corresponding memory space.
  • the video processing FPGA includes:
  • the high-speed serial bus controller 301 is configured to receive the multi-channel video and the corresponding address sent by the decoding module through the high-speed serial bus, and the address of each video is determined by the decoding module according to the requirement of the multi-screen layout;
  • the scaling module 302 is configured to scale the multi-channel video received by the high-speed serial bus controller 301, and the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-screen;
  • the frame buffer module 303 is configured to cache the scaled videos and to correct the address corresponding to each cached video;
  • the memory controller 304 is configured to store the scaled videos in the corresponding memory space according to the corrected address of the frame buffer module 303.
  • the high-speed serial bus controller 301 is further configured to perform deserialization processing on data sent by the decoding module through the high-speed serial bus, parse the valid data, and perform parallel processing on the valid data to obtain parallel data.
  • the scaling module 302 is specifically configured to select a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and to scale the received multi-channel video with it.
  • the video processing FPGA further includes: an arbitration module 305;
  • the arbitration module 305 is configured to sequentially select a video to be stored in the memory space from the multi-channel video buffered by the frame buffer module 303 by using a round-robin mechanism;
  • the arbitration module 305 uses a one-hot encoding state machine, each state represents a request, and the arbitration uses a polling mechanism to ensure that the requests are fair and timely.
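The round-robin arbitration described above can be sketched in software as follows. This is only an illustrative model, not the FPGA logic of the patent: the class name `RoundRobinArbiter`, its methods and the three-channel usage are assumptions made for the example. The one-hot grant word mirrors the idea that each state represents exactly one request.

```python
class RoundRobinArbiter:
    """Illustrative round-robin arbiter: the channel after the most
    recently granted one always has the highest priority."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        # Start so that channel 0 is checked first on the first call.
        self.last_granted = num_channels - 1

    def grant(self, requests):
        """requests: one bool per channel. Returns the granted channel
        index, or None when no channel is requesting."""
        for offset in range(1, self.num_channels + 1):
            channel = (self.last_granted + offset) % self.num_channels
            if requests[channel]:
                self.last_granted = channel
                return channel
        return None

    def grant_one_hot(self, requests):
        """Same decision, returned as a one-hot bit mask (0 if no grant)."""
        channel = self.grant(requests)
        return 0 if channel is None else 1 << channel


arbiter = RoundRobinArbiter(3)
order = [arbiter.grant([True, True, True]) for _ in range(4)]
print(order)  # channels are served in rotation
```

Because priority rotates after every grant, a channel that requests continuously cannot starve the others, which is the fairness property the text attributes to the polling mechanism.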
  • the memory controller 304 is specifically configured to sequentially store the video selected by the arbitration module 305 into a corresponding memory space.
  • the frame buffer module 303 is composed of a one-hot state machine, and each state corresponds to one frame of data.
  • Because the encoding and decoding formats may differ, three frames of buffer space are allocated when the video is buffered, and the state of each of the three frames is tracked. Assume that the first frame is currently marked empty: when the high-speed serial bus controller 301 writes video data to the frame buffer module 303 and the frame buffer module 303 finds the first frame empty, it enters the first-frame state. Once the first frame has been written, it is marked full, indicating that the frame holds video data and can be read. The frame buffer module 303 then checks whether the next frame is empty; if it is, the module jumps to the second frame and starts writing there when the decoding module next transmits data.
  • If the next frame is not empty, the frame buffer module 303 remains in the first-frame state, and when the decoding module writes video data through the high-speed serial bus controller 301, the frame buffer module 303 overwrites the data of the original first frame.
  • In this way the frame buffer module 303 cooperates with the frame read module 306 to perform frame rate conversion by frame dropping.
  • the frame read module 306 is configured to read the synthesized video multi-picture from the memory.
  • the basic structure of the frame read module 306 is the same as the frame buffer module 303, and is also composed of a one-hot state machine.
  • Under the control of the state machine, the frame read module 306 selects one of the three buffered frames to read. Only frames marked full can be read, and a frame is marked empty once it has been read. Suppose the first frame has just been read and the encoding module sends a read command again: the frame read module 306 then checks whether the next frame is full.
  • If it is full, a new frame has just been written and can be read, so the frame read module 306 jumps to the state of the next frame and reads its data. If it is empty, the frame data is not ready yet, so the frame read module 306 stays in its current state and reads the previously read frame again, which performs frame rate conversion by frame repetition.
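The three-frame buffering described for the frame buffer module 303 and the frame read module 306 can be modeled with a small software sketch. It is an assumption-laden simplification: the class `TripleFrameBuffer` and its method names are hypothetical, frames are written and read whole, and the two hardware state machines are collapsed into two indices. It shows how a fast writer overwrites (drops) a frame and a fast reader repeats one.

```python
EMPTY, FULL = "empty", "full"


class TripleFrameBuffer:
    """Illustrative model of three frame slots, each marked empty or full."""

    def __init__(self):
        self.frames = [None, None, None]
        self.state = [EMPTY, EMPTY, EMPTY]
        self.write_idx = 0     # slot the writer is currently parked on
        self.read_idx = None   # slot that was read most recently

    def write_frame(self, data):
        """Store one complete frame. Returns True when an unread frame
        was overwritten, i.e. a frame was dropped."""
        dropped = self.state[self.write_idx] == FULL
        self.frames[self.write_idx] = data
        self.state[self.write_idx] = FULL
        nxt = (self.write_idx + 1) % 3
        if self.state[nxt] == EMPTY:
            self.write_idx = nxt   # next slot is free: move on
        # otherwise stay here; the next write overwrites this slot
        return dropped

    def read_frame(self):
        """Return the next full frame, or repeat the previous one when
        no new frame is ready. Returns None before any frame exists."""
        candidate = 0 if self.read_idx is None else (self.read_idx + 1) % 3
        if self.state[candidate] == FULL:
            self.read_idx = candidate
            self.state[candidate] = EMPTY   # mark as consumed
        if self.read_idx is None:
            return None
        return self.frames[self.read_idx]


buf = TripleFrameBuffer()
for name in ["A", "B", "C", "D"]:   # writer runs ahead of the reader
    buf.write_frame(name)
print([buf.read_frame() for _ in range(4)])
```

In this run the fourth write lands on a slot that is still full, so frame C is dropped, and the fourth read finds no new frame and repeats frame D.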
  • the decoding module includes: an address determining unit 401, configured to determine, according to a requirement of a multi-screen layout, an address corresponding to each of the decoded multi-channel videos;
  • the transmitting unit 402 is configured to send the decoded multi-channel video and the determined corresponding addresses to the video processing FPGA through the high-speed serial bus.
  • An embodiment of the present invention provides a video multi-picture synthesis system.
  • The system includes a decoding module 501 and a video processing FPGA 502, where
  • the decoding module 501 is configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each of the decoded video channels, and to send the decoded multi-channel video and the determined corresponding addresses to the video processing FPGA 502 through the high-speed serial bus;
  • the video processing FPGA 502 is configured to receive, through the high-speed serial bus, the multi-channel video and corresponding addresses sent by the decoding module 501, the address of each video channel being determined by the decoding module 501 according to the requirements of the multi-picture layout;
  • the received multi-channel video is scaled, so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
  • the scaled videos are cached, the address corresponding to each cached video is corrected, and each video is stored in the corresponding memory space according to its corrected address.
  • The encoding and decoding modules use the digital signal processor (DSP) TMS320TCI6608, a multi-core fixed-point/floating-point DSP with a clock frequency of up to 1.25 GHz, which can simultaneously encode or decode two channels of 1080P60 video and supports the RapidIO high-speed serial bus.
  • The video processing FPGA uses the EP4S110GXF1120, which embeds 32 serial transceivers and can implement high-speed serial protocols such as PCIe and RapidIO.
  • A RapidIO interconnection is used between the DSP and the video processing FPGA to transmit video data;
  • the high-speed serial bus controller is a RapidIO controller.
  • The RapidIO interface of the video processing FPGA supports up to 3.125 Gbps per lane in a 4x configuration, so the total bandwidth is 12.5 Gbps; after protocol overhead is excluded, the effective bandwidth is 10 Gbps.
  • One channel of 1080P60 video has a valid data bandwidth of about 2 Gbps, so the link is sufficient to transmit five channels of raw 1080P60 valid data.
  • Four double data rate 3 (DDR3) synchronous dynamic random access memories are attached to the video processing FPGA. Each DDR3 memory is 16 bits wide with a capacity of 2 Gbit and a data rate of 800 Mbps per data line, so the total memory bandwidth is 51.2 Gbps.
  • It is assumed that a three-picture synthesis in the shape of the Chinese character 品 is to be implemented, and that the codec uses the 1080P30 video format.
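The bandwidth figures quoted above can be checked with a short calculation. Two inputs are assumptions not stated in the text: the 20% overhead is attributed here to 8b/10b line coding, and one pixel is taken as 16 bits (as in 16-bit YCbCr 4:2:2 video such as BT.1120). Under those assumptions the numbers reproduce the quoted 12.5 G, 10 G, roughly 2 G, five channels and 51.2 Gbps.

```python
# RapidIO link: 4 lanes at 3.125 Gbaud each (figures from the text).
LANES = 4
LANE_RATE_G = 3.125
raw_link_gbps = LANES * LANE_RATE_G            # total line rate
# Assumption: the overhead is 8b/10b coding (8 payload bits per 10 line bits).
effective_link_gbps = raw_link_gbps * 8 / 10

# One 1080P60 channel; 16 bits per pixel is an assumption.
WIDTH, HEIGHT, FPS, BITS_PER_PIXEL = 1920, 1080, 60, 16
channel_gbps = WIDTH * HEIGHT * FPS * BITS_PER_PIXEL / 1e9

max_channels = int(effective_link_gbps // channel_gbps)

# DDR3: four devices, 16 data lines each, 800 Mbps per data line.
ddr3_total_gbps = 4 * 16 * 800 / 1000

print(f"raw link        : {raw_link_gbps} Gbps")
print(f"effective link  : {effective_link_gbps} Gbps")
print(f"one 1080P60 feed: {channel_gbps:.3f} Gbps")
print(f"channels on link: {max_channels}")
print(f"DDR3 bandwidth  : {ddr3_total_gbps} Gbps")
```

The memory bandwidth is roughly five times the effective link bandwidth, which leaves headroom for the concurrent write and read traffic that frame buffering requires.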
  • Step 601 The video processing FPGA receives the multi-channel video and the corresponding address sent by the decoding module through the high-speed serial bus, and the address of each video is determined by the decoding module according to the requirement of the multi-screen layout;
  • the high-speed serial bus can effectively save the wiring space of the PCB, and the bandwidth of the high-speed serial bus is much larger than that of the VP interface.
  • Step 602 Perform deserialization processing on the data sent by the decoding module through the high-speed serial bus, analyze the valid data, and perform parallel processing on the valid data to obtain parallel data.
  • Step 603 The received multi-channel video is scaled, and the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-screen;
  • A nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm is selected according to the image quality requirement, and the received multi-channel video is scaled with it: nearest-neighbor interpolation if the image quality requirement is low, bilinear interpolation if it is high, and polyphase interpolation if it is very high;
  • in this embodiment the bilinear algorithm can be used.
  • Although the bilinear algorithm produces a certain ringing effect on the image, its image quality meets the requirements of the conference television application scenario.
  • The bilinear algorithm needs only 4 pixels of the original image to generate one pixel of the target image, so its computation and complexity are relatively small.
  • The output is 1080P30 video and the multi-picture is a three-picture composition in the shape of the character 品, so each sub-picture has half the rows and half the columns of the original image; the scaling ratio is therefore 1/2.
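As a minimal sketch of the bilinear scaling mentioned above, the following pure-Python function computes each target pixel from the 4 nearest source pixels. The function name `bilinear_scale`, the single-channel list-of-lists image format and the pixel-centre coordinate mapping are assumptions chosen for illustration; the patent does not specify these details, and a hardware scaler would use fixed-point arithmetic.

```python
def bilinear_scale(src, dst_w, dst_h):
    """Scale a single-channel image (list of rows) to dst_w x dst_h
    using bilinear interpolation over the 4 neighbouring source pixels."""
    src_h, src_w = len(src), len(src[0])
    dst = []
    for y in range(dst_h):
        # Map the centre of the target pixel back into source coordinates.
        fy = (y + 0.5) * src_h / dst_h - 0.5
        fy = min(max(fy, 0.0), src_h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, src_h - 1)
        wy = fy - y0
        row = []
        for x in range(dst_w):
            fx = (x + 0.5) * src_w / dst_w - 0.5
            fx = min(max(fx, 0.0), src_w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, src_w - 1)
            wx = fx - x0
            # Interpolate horizontally on both rows, then vertically.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        dst.append(row)
    return dst


image = [[8 * r + 2 * c for c in range(4)] for r in range(4)]
half = bilinear_scale(image, 2, 2)   # scaling ratio 1/2 in each direction
print(half)
```

With a ratio of exactly 1/2 and this coordinate mapping, every target pixel is the average of one 2x2 block of source pixels.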
  • Step 604 Cache the scaled videos, and correct the addresses corresponding to the cached videos.
  • Because the scaling ratio is 1/2, the row and column addresses of each sub-picture span only half the range of the original image.
  • The storage address of each pixel is recalculated according to the size of the scaled sub-picture and its starting position in the multi-picture, so that the three sub-pictures are stored at the correct positions.
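The address correction can be illustrated with a small calculation that maps a pixel of a scaled sub-picture to its byte address in the composed frame. Everything concrete here is an assumption for the example: the 1920x1080 output frame, 960x540 sub-pictures, 2 bytes per pixel, a linear row-major frame buffer, and the sub-picture origins in `LAYOUT` (one picture centred on top, two below). The patent only states that the address depends on the sub-picture size and its starting position.

```python
FRAME_W, FRAME_H = 1920, 1080   # composed multi-picture (assumed)
SUB_W, SUB_H = 960, 540         # each sub-picture after 1/2 scaling
BYTES_PER_PIXEL = 2             # assumed 16-bit pixels

# Hypothetical (x, y) origin of each channel's sub-picture in the frame.
LAYOUT = {
    0: (480, 0),     # top, horizontally centred
    1: (0, 540),     # bottom left
    2: (960, 540),   # bottom right
}


def corrected_address(channel, row, col, base=0):
    """Byte address in the frame buffer of pixel (row, col) of the
    scaled sub-picture belonging to `channel`."""
    if not (0 <= row < SUB_H and 0 <= col < SUB_W):
        raise ValueError("pixel lies outside the sub-picture")
    origin_x, origin_y = LAYOUT[channel]
    pixel_index = (origin_y + row) * FRAME_W + (origin_x + col)
    return base + pixel_index * BYTES_PER_PIXEL


print(corrected_address(0, 0, 0))       # first pixel of the top picture
print(corrected_address(2, 539, 959))   # last pixel of the bottom-right picture
```

Since each sub-picture row is only half a frame row wide, consecutive sub-picture rows are not contiguous in memory, which is why the address has to be recomputed per row rather than simply incremented.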
  • The video is cached as follows: two line-buffer random access memories (RAMs) are provided inside the video processing FPGA. When the video sent by the decoding DSP has filled the first line RAM, writing switches to the second line RAM; at the same time a write request is sent to the arbitration module, and once arbitration grants the request, the data held in the first line RAM is written to the DDR3 memory.
  • When the second line RAM is full, writing switches back to the first line RAM. This ping-pong operation improves the storage efficiency of the DDR3 memory.
  • When all three sub-pictures have been stored at their corresponding positions, the RapidIO controller issues a read command; once the frame read module determines that a complete multi-picture frame has been stored, it reads that frame of data.
  • The frame read module also uses ping-pong buffering internally to match the speeds of the DDR3 memory and the RapidIO controller and to improve the read efficiency of the DDR3 memory.
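The ping-pong line buffering can be sketched as two banks that swap roles each time a line is completed. The class `PingPongLineBuffer` and its method names are hypothetical, and the model is simplified: it assumes the arbiter grants the write request (modelled by `drain`) before the other bank fills up, so overrun handling is omitted.

```python
class PingPongLineBuffer:
    """Illustrative two-bank line buffer: one bank is filled with
    incoming pixels while the other waits to be written to memory."""

    def __init__(self, line_len):
        self.line_len = line_len
        self.banks = [[], []]
        self.write_bank = 0      # bank currently receiving pixels
        self.ready_bank = None   # completed bank waiting for a grant

    def push_pixel(self, pixel):
        """Add one pixel. Returns True when a line has just been
        completed, i.e. when a write request should be raised."""
        bank = self.banks[self.write_bank]
        bank.append(pixel)
        if len(bank) == self.line_len:
            self.ready_bank = self.write_bank
            self.write_bank ^= 1                  # switch to the other bank
            self.banks[self.write_bank] = []      # start it empty
            return True
        return False

    def drain(self):
        """Called once arbitration grants the request: returns the
        completed line for writing to memory, or None if none is ready."""
        if self.ready_bank is None:
            return None
        line = list(self.banks[self.ready_bank])
        self.ready_bank = None
        return line


buf = PingPongLineBuffer(line_len=2)
for pixel in [1, 2, 3, 4]:
    if buf.push_pixel(pixel):
        print("line ready:", buf.drain())
```

Writing whole lines in bursts rather than pixel by pixel is what lets this scheme improve memory efficiency, because incoming pixels never have to wait for the memory write to finish.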
  • Step 605 Select, by using a round-robin mechanism, the video to be stored in the memory space from the cached multi-channel video.
  • Step 606 Store the selected videos into the corresponding memory space according to the corrected address.
  • The present invention provides a video multi-picture synthesis method, apparatus and system, wherein the method comprises: a video processing field programmable gate array receives, through a high-speed serial bus, multiple channels of video and their corresponding addresses sent by a decoding module,
  • the address of each video channel being determined by the decoding module according to the requirements of the multi-picture layout; the received multi-channel video is scaled so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
  • the scaled videos are cached, the address corresponding to each cached video is corrected, and each video is stored in the corresponding memory space according to its corrected address.
  • The invention can save system resources and improve data transmission speed and image quality.

Abstract

Disclosed are a method, device and system for synthesizing a multi-screen video. The method comprises: a video processing field programmable gate array (FPGA) receiving, through a high-speed serial bus, multiple videos and respective corresponding addresses sent by a decoding module, the addresses of the videos being determined by the decoding module according to a multi-screen arrangement requirement; zooming the multiple received videos, so that the size of each video is the same as that of a corresponding screen of multiple screens; buffering the zoomed videos, and modifying the address corresponding to each of the buffered videos; storing each of the zoomed videos in a corresponding memory space according to the modified address. The present invention can conserve system resources, and improve the data transmission speed and image quality.

Description

Video multi-picture synthesis method, apparatus and system
Technical Field
The present invention relates to video conferencing technology, and in particular to a video multi-picture synthesis method, apparatus and system.
Background
An HD conference TV terminal usually adopts the hardware architecture shown in FIG. 1, and works as follows. The network communication module 110 receives the network packets transmitted by the remote conference TV terminal and sends them to the main control processor 109 for unpacking, which yields the far-end compressed video stream. The compressed video data is then passed to the decoding module 105 over the system bus 108. After decompressing the video data, the decoding module 105 obtains data in the original RAW format, which its video interface (Video Port, VP) 106 packages into standard BT.1120 format video data and sends to the video processing field programmable gate array (FPGA) 107. At the same time, the local video is sent to the video switching matrix 103 through the video input interface module 101, and the switching matrix 103 also sends this video data to the video processing FPGA 107 according to the system configuration. The video processing FPGA 107 performs video scaling and multi-picture synthesis on the far-end and local video according to the system configuration, and the result is output for display from the video output interface module 102 through the video switching matrix 103. The encoding module 104 obtains the locally input video from the video processing FPGA 107 and compresses and encodes the original image to reduce the image bit rate. The compressed code stream is then passed over the system bus 108 to the main control processor 109 for network packaging and transmitted to the far end through the network communication module 110. This completes the point-to-point communication flow between two conference TV terminals.
At present, data is transferred between the encoding and decoding modules and the video processing FPGA over a parallel VP interface. The VP interface is a 16-bit data bus with very limited bandwidth: it can carry at most one channel of 1080P60 video data. As HD conference TV terminals gain built-in multipoint control unit (MCU) functionality, the amount of data to be transferred between the encoding and decoding modules and the video processing FPGA increases greatly, and the parallel VP interface can no longer meet the data transmission needs. When multiple channels of high-resolution, high-frame-rate decoded video need to be transmitted, the decoding module must first scale them down to reduce the data stream bandwidth before sending them to the video processing FPGA through the VP interface. The video processing FPGA then has to scale the video a second time and extract pictures before performing multi-picture synthesis, which increases system complexity, wastes system resources and reduces image quality. In addition, the parallel VP interface occupies a lot of printed circuit board (PCB) wiring space, and when the video clock frequency is high, especially for 1080P60 video, the bus timing is difficult to control.
Summary of the Invention
In view of this, the main objective of the embodiments of the present invention is to provide a video multi-picture synthesis method, apparatus and system that save system resources and improve data transmission speed and image quality.
To achieve the above objective, the technical solution of the embodiments of the present invention is implemented as follows:
An embodiment of the present invention provides a video multi-picture synthesis method, where the method includes: a video processing FPGA receives, through a high-speed serial bus, multiple channels of video and their corresponding addresses sent by a decoding module, the address of each video channel being determined by the decoding module according to the requirements of the multi-picture layout;
the received multi-channel video is scaled, so that the size of each scaled video is the same as the size of the corresponding sub-picture in the multi-picture;
the scaled videos are cached, and the address corresponding to each cached video is corrected;
each video is stored in the corresponding memory space according to its corrected address.
Preferably, before the received multi-channel video is scaled, the method further includes: the data sent by the decoding module through the high-speed serial bus is deserialized, the valid data is parsed out, and the valid data is converted to parallel form to obtain parallel data.
Preferably, scaling the received multi-channel video is as follows:
a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm is selected according to the image quality requirement, and the received multi-channel video is scaled with it.
Preferably, before each video is stored in the corresponding memory space according to its corrected address, the method further includes:
the videos to be stored in the memory space are selected in turn from the cached multi-channel video by a round-robin mechanism;
correspondingly, storing each video in the corresponding memory space is as follows:
the selected videos are stored in turn in the corresponding memory space.
An embodiment of the present invention provides a video processing FPGA, the video processing FPGA including:
a high-speed serial bus controller, configured to receive, over a high-speed serial bus, multiple channels of video and their respective addresses sent by a decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
a scaling module, configured to scale the channels of video received by the high-speed serial bus controller, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
a frame buffer module, configured to buffer the scaled videos and to correct the address corresponding to each buffered video; and
a memory controller, configured to store the scaled videos into their corresponding memory spaces according to the addresses corrected by the frame buffer module.
Preferably, the high-speed serial bus controller is further configured to deserialize the data sent by the decoding module over the high-speed serial bus, parse out the valid data, and convert the valid data into parallel data.
Preferably, the scaling module is specifically configured to select a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and to scale the received channels of video with the selected algorithm.
Preferably, the video processing FPGA further includes an arbitration module;
the arbitration module is configured to select in turn, by a round-robin mechanism, the video to be stored into the memory space from the channels of video buffered by the frame buffer module;
correspondingly, the memory controller is configured to store the videos selected by the arbitration module in turn into their corresponding memory spaces.
Preferably, the frame buffer module consists of a one-hot state machine, each state corresponding to one frame of data.
An embodiment of the present invention provides a video multi-picture synthesis system, the system including a decoding module and a video processing FPGA, wherein
the decoding module is configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each of the channels of video it decodes, and to send the decoded channels of video and their determined addresses to the video processing FPGA over a high-speed serial bus;
the video processing FPGA is configured to receive, over the high-speed serial bus, the channels of video and their respective addresses sent by the decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
scale the received channels of video, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
buffer the scaled videos and correct the address corresponding to each buffered video; and store the scaled videos into their corresponding memory spaces according to the corrected addresses.
As can be seen from the above, the technical solution of the embodiments of the present invention includes: a video processing field programmable gate array (FPGA) receives, over a high-speed serial bus, multiple channels of video and their respective addresses sent by a decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout; the received channels of video are scaled, each scaled video having the same size as its corresponding sub-picture in the multi-picture; the scaled videos are buffered and the address corresponding to each buffered video is corrected; and the scaled videos are stored into their corresponding memory spaces according to the corrected addresses. Because the data is transferred over a high-speed serial bus, system resources are saved and data transmission speed and image quality are improved.

Description of the drawings
FIG. 1 is a schematic diagram of the hardware architecture of an existing high-definition conference television terminal;
FIG. 2 is a schematic flowchart of the first embodiment of the video multi-picture synthesis method of the present invention;
FIG. 3 is a schematic structural diagram of an embodiment of the video processing FPGA of the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of the decoding module of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the video multi-picture synthesis system of the present invention;
FIG. 6 is a schematic flowchart of the second embodiment of the video multi-picture synthesis method of the present invention;
FIG. 7 is a schematic diagram of the three-channel sub-picture synthesis structure of an embodiment of the present invention.

Detailed description
In a first embodiment of the video multi-picture synthesis method provided by the present invention, as shown in FIG. 2, the method includes:
Step 201: The video processing FPGA receives, over a high-speed serial bus, multiple channels of video and their respective addresses sent by the decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
Step 202: The received channels of video are scaled, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
Step 203: The scaled videos are buffered and the address corresponding to each buffered video is corrected; the scaled videos are then stored into their corresponding memory spaces according to the corrected addresses.
Preferably, before scaling the received channels of video, the method further includes:
deserializing the data sent by the decoding module over the high-speed serial bus, parsing out the valid data, and converting the valid data into parallel data.
Preferably, scaling the received channels of video is:
selecting a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and scaling the received channels of video with the selected algorithm.
Preferably, before storing each video into its corresponding memory space according to the corrected address, the method further includes:
selecting in turn, by a round-robin mechanism, the video to be stored into the memory space from the buffered channels of video;
correspondingly, storing each video into its corresponding memory space is:
storing the selected videos in turn into their corresponding memory spaces.
In an embodiment of the video processing FPGA provided by the present invention, as shown in FIG. 3, the video processing FPGA includes:
a high-speed serial bus controller 301, configured to receive, over a high-speed serial bus, multiple channels of video and their respective addresses sent by the decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
a scaling module 302, configured to scale the channels of video received by the high-speed serial bus controller 301, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
a frame buffer module 303, configured to buffer the scaled videos and to correct the address corresponding to each buffered video;
a memory controller 304, configured to store the scaled videos into their corresponding memory spaces according to the addresses corrected by the frame buffer module 303.
Preferably, the high-speed serial bus controller 301 is further configured to deserialize the data sent by the decoding module over the high-speed serial bus, parse out the valid data, and convert the valid data into parallel data.
Preferably, the scaling module 302 is specifically configured to select a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and to scale the received channels of video with the selected algorithm.
Preferably, the video processing FPGA further includes an arbitration module 305;
the arbitration module 305 is configured to select in turn, by a round-robin mechanism, the video to be stored into the memory space from the channels of video buffered by the frame buffer module 303;
Here, the memory can perform only one read or write operation at a time, while several channels may issue read or write requests to it simultaneously. The requests therefore have to be arbitrated to decide which one is granted at any given moment. The arbitration module 305 uses a one-hot encoded state machine in which each state represents one requesting channel, and it arbitrates by round-robin so that every request receives a fair and timely response.
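The arbitration described above can be sketched in software as follows. This is a minimal behavioural model, not the patent's RTL: the function name `round_robin_grant`, its arguments and the list-of-booleans encoding of the one-hot grant are all assumptions made for illustration.

```python
def round_robin_grant(requests, last_grant):
    """Pick the next requester after `last_grant` in circular order.

    requests   -- list of bools, one per channel (True = request pending)
    last_grant -- index of the channel granted most recently, -1 after reset
    Returns (grant, index): `grant` is a one-hot list (at most one True),
    `index` is the granted channel or None when nothing is pending.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n      # search starts just after last winner
        if requests[idx]:
            return [i == idx for i in range(n)], idx
    return [False] * n, None                 # idle: no channel is requesting


# Three channels all requesting: grants rotate 0, 1, 2, 0, ...
last = -1
order = []
for _ in range(4):
    _, last = round_robin_grant([True, True, True], last)
    order.append(last)
print(order)
```

Starting the search one position after the previous winner is what gives each channel a bounded wait, which matches the fairness goal stated in the text.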
Correspondingly, the memory controller 304 is specifically configured to store the videos selected by the arbitration module 305 in turn into their corresponding memory spaces.
Preferably, the frame buffer module 303 consists of a one-hot state machine, each state corresponding to one frame of data.
Here, because the channels of video may use different formats, and the encoding and decoding formats may also differ, three frames of buffer space are allocated when buffering the video, and the state of each of the three frames is tracked. Suppose the first frame is currently empty. When the high-speed serial bus controller 301 writes video data to the frame buffer module 303 and the frame buffer module 303 finds the first frame empty, it enters the first-frame state. Once the first frame has been completely written it is marked full, meaning that it holds a complete frame of video data and can be read. The frame buffer module 303 then checks the state of the next frame. If it is empty, the module moves to the second frame and starts writing there when the decoding module next sends data. If the second frame is full, that is, it still holds a complete frame and is being read by the frame reading module 306, the frame buffer module 303 stays in the first-frame state, and when the decoding module writes video data through the high-speed serial bus controller 301, the new data overwrites the original first frame. Working together with the frame reading module 306, the frame buffer module 303 thus performs frame-rate conversion by dropping frames.
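The write-side behaviour can be illustrated with a small software model. It is a sketch under simplifying assumptions: whole frames are written atomically, the class name `FrameWriter` and its fields are hypothetical, and the check of the next slot is made when the next frame arrives.

```python
EMPTY, FULL = "empty", "full"


class FrameWriter:
    """Behavioural model of a three-slot frame buffer on the write side."""

    def __init__(self, num_frames=3):
        self.state = [EMPTY] * num_frames   # per-slot status flags
        self.data = [None] * num_frames     # per-slot frame contents
        self.cur = 0                        # slot the state machine is in
        self.dropped = 0                    # frames lost by overwriting

    def write_frame(self, frame):
        if self.state[self.cur] == FULL:
            nxt = (self.cur + 1) % len(self.state)
            if self.state[nxt] == EMPTY:
                self.cur = nxt              # next slot is free: advance
            else:
                self.dropped += 1           # next slot busy: overwrite current
        self.data[self.cur] = frame
        self.state[self.cur] = FULL


w = FrameWriter()
for f in "ABCD":                            # four writes, no reads in between
    w.write_frame(f)
print(w.data, w.dropped)
```

With no reader draining the slots, the fourth frame replaces the third, which is the frame-dropping case the paragraph describes.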
The frame reading module 306 is configured to read the synthesized multi-picture video from the memory. Its basic architecture is the same as that of the frame buffer module 303, and it likewise consists of a one-hot state machine. When the encoding module reads the multi-picture video at the corresponding address through the high-speed serial bus controller 301, the frame reading module 306, under control of the state machine, selects one of the three buffered frames to read. Only a frame marked full can be read, and once it has been read its state is set to empty. Suppose the first frame has just been read and the encoding module issues another read command. The frame reading module 306 checks whether the next frame is full. If it is, a freshly written frame is available, so the module moves to the next-frame state and reads it. If it is empty, the frame is not ready yet, so the module stays in its current state and reads the frame it has just read once more, which performs frame-rate conversion by repeating a frame.
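The read-side counterpart can be modelled the same way. Again this is only a sketch: `FrameReader` and its fields are hypothetical names, frames are read atomically, and the model ignores the possibility that a slot already marked empty is overwritten before it is repeated.

```python
EMPTY, FULL = "empty", "full"


class FrameReader:
    """Behavioural model of the read side of a three-slot frame buffer."""

    def __init__(self, state, data):
        self.state = state      # shared per-slot status flags
        self.data = data        # shared per-slot frame contents
        self.cur = None         # slot read most recently (None before first read)
        self.repeated = 0       # frames delivered twice

    def read_frame(self):
        nxt = 0 if self.cur is None else (self.cur + 1) % len(self.state)
        if self.state[nxt] == FULL:
            self.cur = nxt
            self.state[nxt] = EMPTY         # release the slot after reading
            return self.data[nxt]
        if self.cur is None:
            return None                     # nothing has been written yet
        self.repeated += 1                  # next frame not ready: repeat last
        return self.data[self.cur]


state = [FULL, FULL, EMPTY]
data = ["A", "B", None]
r = FrameReader(state, data)
out = [r.read_frame() for _ in range(3)]
print(out, r.repeated)
```

The third read finds the next slot empty and returns the previous frame again, which is the frame-copy case in the text.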
In an embodiment of the decoding module provided by the present invention, as shown in FIG. 4, the decoding module includes:
an address determination unit 401, configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each of the decoded channels of video;
a sending unit 402, configured to send the decoded channels of video and their determined addresses to the video processing FPGA over a high-speed serial bus.
In an embodiment of the video multi-picture synthesis system provided by the present invention, as shown in FIG. 5, the system includes a decoding module 501 and a video processing FPGA 502, wherein
the decoding module 501 is configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each of the channels of video it decodes, and to send the decoded channels of video and their determined addresses to the video processing FPGA 502 over a high-speed serial bus;
the video processing FPGA 502 is configured to receive, over the high-speed serial bus, the channels of video and their respective addresses sent by the decoding module 501, the address of each channel of video being determined by the decoding module 501 according to the requirements of the multi-picture layout;
scale the received channels of video, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
buffer the scaled videos and correct the address corresponding to each buffered video; and store the scaled videos into their corresponding memory spaces according to the corrected addresses.
The second embodiment of the video multi-picture synthesis method provided by the present invention is described below with reference to FIG. 6. In this embodiment, the encoding and decoding modules use a digital signal processor (DSP), the TMS320TCI6608, a multi-core fixed-point/floating-point DSP with a clock frequency of up to 1.25 GHz that can encode or decode two channels of 1080P60 video simultaneously and supports the RapidIO high-speed serial bus. The video processing FPGA is an EP4S110GXF1120, which embeds 32 serial transceivers and can implement several high-speed serial protocols such as PCIe and RapidIO. In this embodiment the DSP and the video processing FPGA are interconnected by RapidIO to carry the video data, and the high-speed serial bus controller is a RapidIO controller. In a 4x configuration, the RapidIO interface of the video processing FPGA supports up to 3.125 Gbps per lane, for a total bandwidth of 12.5 Gbps; after protocol overhead the effective bandwidth is 10 Gbps. One channel of 1080P60 video has an effective data bandwidth of 2 Gbps, so the link is sufficient to carry five channels of raw 1080P60 valid data. Four double data rate 3 (DDR3) synchronous dynamic random access memory chips are attached to the video processing FPGA; each chip is 16 bits wide with a capacity of 2 Gbits and a rate of 800 Mbps per data pin, giving a total memory bandwidth of 51.2 Gbps. Assume that a three-picture composite in a 品-shaped layout (one sub-picture above, two below) is to be produced, and that both the encoded and the decoded video formats are 1080P30.
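The bandwidth figures quoted above can be checked with a few lines of arithmetic. Two inputs are assumptions rather than statements from the text: that the 10 Gbps effective rate corresponds to 8b/10b line coding, and that the video uses 16 bits per pixel, which is what makes one 1080P60 channel come out at roughly 2 Gbps.

```python
# Serial link budget, kept in integer Mbps to avoid floating-point rounding.
LANES = 4
LANE_RATE_MBPS = 3125                         # 3.125 Gbps per lane (from the text)
raw_mbps = LANES * LANE_RATE_MBPS             # 12500 Mbps = 12.5 Gbps
effective_mbps = raw_mbps * 8 // 10           # assumed 8b/10b coding: 10000 Mbps

# One 1080P60 channel, assuming 16 bits per pixel.
BITS_PER_PIXEL = 16
channel_bps = 1920 * 1080 * 60 * BITS_PER_PIXEL   # 1990656000 bit/s, about 2 Gbps

# Whole channels that fit in the effective link bandwidth.
max_channels = effective_mbps * 10**6 // channel_bps   # 5

# DDR3: four chips, 16 data pins each, 800 Mbps per pin.
ddr3_mbps = 4 * 16 * 800                      # 51200 Mbps = 51.2 Gbps

print(raw_mbps, effective_mbps, channel_bps, max_channels, ddr3_mbps)
```

Five channels fit with almost no headroom (5 x 1.99 Gbps is about 9.95 Gbps), so a practical design would likely budget for fewer once packet headers are counted.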
Step 601: The video processing FPGA receives, over a high-speed serial bus, multiple channels of video and their respective addresses sent by the decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
Here, using a high-speed serial bus saves PCB routing space, and its bandwidth is far greater than that of the VP interface. The high-speed serial bus interface of a current mid-range video processing FPGA can reach a transfer rate of 4 x 3.125 Gbps = 12.5 Gbps and can carry four channels of valid 1080P60 video, whereas the VP interface can carry at most one.
Step 602: The data sent by the decoding module over the high-speed serial bus is deserialized, the valid data is parsed out, and the valid data is converted into parallel data.
Step 603: The received channels of video are scaled, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
Specifically, a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm is selected according to the image quality requirement, and the received channels of video are scaled with it. If the image quality requirement is low, the nearest-neighbor algorithm is selected; if it is fairly high, the bilinear algorithm is selected; if it is very high, the polyphase algorithm is selected;
Here, to balance performance and complexity, the bilinear algorithm can be used. Although the bilinear algorithm introduces some ringing into the image, its quality is sufficient for conference television applications. It also needs only 4 pixels of the original image to produce one pixel of the target image, so both the amount of computation and the complexity are relatively low. In this example all outputs are 1080P30 video and the multi-picture is a 品-shaped composite of three pictures, so each sub-picture has half the rows and half the columns of the original image, and the scaling ratio is 1/2.
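A minimal bilinear scaler illustrates the "4 source pixels per output pixel" property mentioned above. It is a plain-Python sketch on a single-channel image stored as a list of rows; the function name `bilinear_resize` and the pixel-centre sampling convention are assumptions, and an FPGA implementation would use fixed-point arithmetic instead.

```python
import math


def bilinear_resize(src, out_h, out_w):
    """Resize a 2-D list of numbers with bilinear interpolation."""
    in_h, in_w = len(src), len(src[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        fy = (y + 0.5) * in_h / out_h - 0.5          # source row (centre-aligned)
        y0 = min(max(math.floor(fy), 0), in_h - 1)   # upper neighbour, clamped
        y1 = min(y0 + 1, in_h - 1)                   # lower neighbour, clamped
        wy = min(max(fy - y0, 0.0), 1.0)             # vertical weight
        for x in range(out_w):
            fx = (x + 0.5) * in_w / out_w - 0.5
            x0 = min(max(math.floor(fx), 0), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)
            wx = min(max(fx - x0, 0.0), 1.0)
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out


# 4x4 test image scaled by 1/2 in each direction.
src = [[0, 2, 4, 6],
       [2, 4, 6, 8],
       [4, 6, 8, 10],
       [6, 8, 10, 12]]
half = bilinear_resize(src, 2, 2)
print(half)
```

At a ratio of exactly 1/2 with this sampling convention, both weights are 0.5, so each output pixel is simply the mean of a 2x2 block of source pixels.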
Step 604: The scaled videos are buffered and the address corresponding to each buffered video is corrected;
Here, because scaling changes the number of pixels in the image, the addresses have to be remapped as well. In this example the scaling ratio is 1/2, so the row and column addresses each span only half of those of the original image. The storage address of every pixel is recalculated from the size of the scaled sub-picture and its starting position in the multi-picture, so that the three sub-pictures are stored exactly at the positions of remote 1, remote 2 and remote 3 in FIG. 7;
Because the rate at which the DSP delivers video differs from the rate of the DDR3 memory, the video is buffered with a ping-pong buffer method to improve DDR3 storage efficiency. Two line-buffer random access memories (RAMs) are allocated inside the video processing FPGA. When the video sent by the decoding DSP has filled the first line RAM, writing switches to the second line RAM and a write request is raised to the arbitration module; once the arbitration module grants the request, the data held in the first line RAM is written into the DDR3 memory. When the second line RAM is full, writing switches back to the first line RAM. This ping-pong operation improves DDR3 storage efficiency. Once all three sub-pictures have been stored at their positions, the multi-picture synthesis is complete.
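The address recalculation can be sketched as a row-major offset computation. Everything concrete here is an assumption for illustration: FIG. 7 is not reproduced in the text, so the sub-picture origins in `LAYOUT` are a guess at a 品-shaped arrangement, and the 2 bytes per pixel and the name `pixel_address` are hypothetical as well.

```python
FRAME_W, FRAME_H = 1920, 1080     # composite multi-picture size
SUB_W, SUB_H = 960, 540           # each sub-picture after 1/2 scaling
BYTES_PER_PIXEL = 2               # assumed 16-bit pixels

# Hypothetical top-left corners (x, y) of the three sub-pictures.
LAYOUT = {
    "remote1": (480, 0),          # centred on the top half
    "remote2": (0, 540),          # bottom left
    "remote3": (960, 540),        # bottom right
}


def pixel_address(base, origin, row, col):
    """Byte address of pixel (row, col) of a sub-picture inside the composite frame."""
    x0, y0 = origin
    return base + ((y0 + row) * FRAME_W + (x0 + col)) * BYTES_PER_PIXEL


first = pixel_address(0, LAYOUT["remote1"], 0, 0)
last = pixel_address(0, LAYOUT["remote3"], SUB_H - 1, SUB_W - 1)
print(first, last)
```

The last pixel of the bottom-right sub-picture lands on the last pixel of the composite frame, which is a quick sanity check that the three regions tile the intended area.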
After one frame of all three sub-pictures has been stored, the module moves on to the next frame and repeats the same operations. If at this point the encoding DSP needs a new multi-picture frame to encode, it issues a read command through the RapidIO controller. Once the frame reading module determines that a complete multi-picture frame has been stored, it reads that frame. The frame reading module also uses the ping-pong buffer method internally to match the rates of the DDR3 memory and the RapidIO controller and to improve DDR3 read efficiency: each time one line RAM has been filled, its data is sent to the encoding DSP through the RapidIO controller. When the whole frame has been read, the module waits for the next read command from the encoding DSP.
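The ping-pong line buffering used on the write path can be modelled as two line RAMs that swap roles. This is a simplified sketch: the class `PingPongLineBuffer`, its methods and the immediate grant in the usage loop are assumptions, and real hardware would fill one RAM while the other is being drained concurrently.

```python
class PingPongLineBuffer:
    """Two line RAMs: one is filled while the other waits to be drained."""

    def __init__(self, line_len):
        self.line_len = line_len
        self.rams = [[], []]
        self.wr = 0               # RAM currently being filled
        self.pending = None       # full RAM waiting for an arbiter grant

    def push(self, pixel):
        """Store one pixel; return True when a write request is raised."""
        self.rams[self.wr].append(pixel)
        if len(self.rams[self.wr]) < self.line_len:
            return False
        if self.pending is not None:
            raise RuntimeError("overrun: previous line not drained yet")
        self.pending = self.wr    # this RAM is full: request a memory write
        self.wr ^= 1              # keep filling the other RAM meanwhile
        return True

    def grant(self):
        """Arbiter grant: hand the full line to memory and free its RAM."""
        if self.pending is None:
            return None
        line = self.rams[self.pending]
        self.rams[self.pending] = []
        self.pending = None
        return line


buf = PingPongLineBuffer(line_len=4)
ddr = []                          # stands in for the DDR3 memory
for px in range(12):
    if buf.push(px):
        ddr.append(buf.grant())   # grant immediately in this toy example
print(ddr)
```

The overrun check makes the timing constraint explicit: the memory side must drain a full line RAM before the input side finishes filling the other one.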
Step 605: The video to be stored into the memory space is selected in turn, by a round-robin mechanism, from the buffered channels of video.
Step 606: The selected videos are stored in turn into their corresponding memory spaces according to the corrected addresses.
The above is merely the preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention.

Industrial applicability
The present invention provides a video multi-picture synthesis method, device and system. The method includes: a video processing field programmable gate array receives, over a high-speed serial bus, multiple channels of video and their respective addresses sent by a decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout; the received channels of video are scaled, each scaled video having the same size as its corresponding sub-picture in the multi-picture; the scaled videos are buffered and the address corresponding to each buffered video is corrected; and the scaled videos are stored into their corresponding memory spaces according to the corrected addresses. The present invention saves system resources and improves data transmission speed and image quality.

Claims

1. A video multi-picture synthesis method, the method comprising:
a video processing field programmable gate array (FPGA) receiving, over a high-speed serial bus, multiple channels of video and their respective addresses sent by a decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
scaling the received channels of video, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
buffering the scaled videos and correcting the address corresponding to each buffered video; and
storing the scaled videos into their corresponding memory spaces according to the corrected addresses.
2. The method according to claim 1, wherein before scaling the received channels of video, the method further comprises:
deserializing the data sent by the decoding module over the high-speed serial bus, parsing out the valid data, and converting the valid data into parallel data.
3. The method according to claim 1, wherein scaling the received channels of video is:
selecting a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm or a polyphase interpolation algorithm according to the image quality requirement, and scaling the received channels of video with the selected algorithm.
4. The method according to claim 1, wherein before storing each video into its corresponding memory space according to the corrected address, the method further comprises:
selecting in turn, by a round-robin mechanism, the video to be stored into the memory space from the buffered channels of video;
correspondingly, storing each video into its corresponding memory space is:
storing the selected videos in turn into their corresponding memory spaces.
5. A video processing field programmable gate array (FPGA), the video processing FPGA comprising:
a high-speed serial bus controller, configured to receive, over a high-speed serial bus, multiple channels of video and their respective addresses sent by a decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
a scaling module, configured to scale the channels of video received by the high-speed serial bus controller, each scaled video having the same size as its corresponding sub-picture in the multi-picture;
a frame buffer module, configured to buffer the scaled videos and to correct the address corresponding to each buffered video;
a memory controller, configured to store the scaled videos into their corresponding memory spaces according to the addresses corrected by the frame buffer module.
6. The video processing FPGA according to claim 5, wherein the high-speed serial bus controller is further configured to deserialize the data sent by the decoding module over the high-speed serial bus, parse out the valid data, and convert the valid data to parallel form to obtain parallel data.
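The deserialize, extract, and parallelize sequence in claim 6 can be pictured with the toy model below. The claim does not specify the framing of the serial link, so the per-bit `valid` flag, the MSB-first bit order, and the 8-bit word width are all assumptions made purely for illustration.

```python
def deserialize(serial_bits, width=8):
    """Pack valid serial bits (MSB first) into parallel words of `width` bits.

    `serial_bits` is an iterable of (bit, valid) pairs; pairs with
    valid == False model idle or padding bits and are dropped.
    A trailing group shorter than `width` is discarded.
    """
    words = []
    shift, count = 0, 0
    for bit, valid in serial_bits:
        if not valid:
            continue
        shift = (shift << 1) | bit  # shift the new bit in from the right
        count += 1
        if count == width:
            words.append(shift)     # one full parallel word is ready
            shift, count = 0, 0
    return words
```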
7. The video processing FPGA according to claim 5, wherein the scaling module is configured to select a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm, or a polyphase interpolation algorithm according to the image quality requirement, and to scale the received multi-channel video with the selected algorithm.
8. The video processing FPGA according to claim 5, wherein the video processing FPGA further comprises an arbitration module;
the arbitration module is configured to select in turn, through a round-robin mechanism, the video to be stored into the memory space from the multi-channel video buffered by the frame buffer module;
correspondingly, the memory controller is configured to store the videos selected by the arbitration module into the corresponding memory spaces in sequence.
9. The video processing FPGA according to claim 8, wherein the frame buffer module consists of a one-hot state machine, each state corresponding to one frame of data.
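In a one-hot state machine, exactly one bit of the state register is set at any time, and each bit position stands for one state. The sketch below models the arrangement in claim 9, with one state per frame; the class name, the fixed rotation order, and the choice of frame 0 as the initial state are assumptions for this example.

```python
class OneHotFrameFsm:
    """State register with one bit per frame; exactly one bit is ever set."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.state = 1  # bit 0 set: frame 0 is the active state

    def advance(self):
        """Move to the next frame by rotating the single set bit left."""
        mask = (1 << self.num_frames) - 1
        wrapped = self.state >> (self.num_frames - 1)  # top bit wraps to bit 0
        self.state = ((self.state << 1) | wrapped) & mask
        return self.state

    def active_frame(self):
        """Index of the frame whose state bit is set."""
        return self.state.bit_length() - 1

    def is_one_hot(self):
        """True when exactly one bit of the state register is set."""
        return self.state != 0 and self.state & (self.state - 1) == 0
```

One-hot encoding uses more flip-flops than a binary counter but keeps the next-state and decode logic trivial, which is a common trade-off in FPGA designs.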
10. A decoding module, the decoding module comprising:
an address determination unit, configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each channel of the decoded multi-channel video;
a sending unit, configured to send the decoded multi-channel video and the determined address of each channel to a video processing field programmable gate array (FPGA) over a high-speed serial bus.
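One way to picture how an address can follow from a multi-picture layout is sketched below. It assumes a uniform grid of sub-pictures stored in a single linear, raster-ordered frame buffer; the function name, its parameters, and the 2-bytes-per-pixel default are hypothetical and are not taken from the claims.

```python
def sub_picture_addresses(base, frame_w, frame_h, rows, cols, bytes_per_pixel=2):
    """Start address of each sub-picture in a rows x cols multi-picture layout.

    The composite frame is assumed to be stored row by row starting at
    `base`; channel i occupies grid cell (i // cols, i % cols).
    """
    sub_w, sub_h = frame_w // cols, frame_h // rows
    addresses = []
    for channel in range(rows * cols):
        row, col = divmod(channel, cols)
        # Pixels skipped before the top-left corner of this sub-picture.
        pixel_offset = row * sub_h * frame_w + col * sub_w
        addresses.append(base + pixel_offset * bytes_per_pixel)
    return addresses
```

For a 1920x1080 composite split 2x2, each sub-picture is 960x540, and the second row of sub-pictures starts 540 full lines into the buffer.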
11. A video multi-picture synthesis system, the system comprising a decoding module and a video processing field programmable gate array (FPGA), wherein
the decoding module is configured to determine, according to the requirements of the multi-picture layout, the address corresponding to each channel of the multi-channel video it decodes, and to send the decoded multi-channel video and the determined address of each channel to the video processing FPGA over a high-speed serial bus;
the video processing FPGA is configured to receive, over the high-speed serial bus, the multi-channel video and the respective addresses sent by the decoding module, the address of each channel of video being determined by the decoding module according to the requirements of the multi-picture layout;
scale the received multi-channel video, the size of each scaled channel of video being the same as the size of the corresponding sub-picture in the multi-picture;
buffer each scaled channel of video, and correct the address corresponding to each buffered channel of video;
PCT/CN2013/086014 2012-11-23 2013-10-25 Method, device and system for synthesizing multi-screen video WO2014079303A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210482587.5A CN103841359A (en) 2012-11-23 2012-11-23 Video multi-image synthesizing method, device and system
CN201210482587.5 2012-11-23

Publications (1)

Publication Number Publication Date
WO2014079303A1 (en) 2014-05-30

Family

ID=50775512

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/086014 WO2014079303A1 (en) 2012-11-23 2013-10-25 Method, device and system for synthesizing multi-screen video

Country Status (2)

Country Link
CN (1) CN103841359A (en)
WO (1) WO2014079303A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120867A (en) * 2018-09-27 2019-01-01 乐蜜有限公司 Image synthesizing method and device
CN109714502A (en) * 2019-01-16 2019-05-03 中科亿海微电子科技(苏州)有限公司 A kind of method of anti-interference LED Mosaic screen transmission of video

Families Citing this family (7)

Publication number Priority date Publication date Assignee Title
CN104935831B (en) * 2015-06-12 2017-10-27 中国科学院自动化研究所 Parallel leggy image interpolation apparatus and method
CN105630446B (en) * 2015-12-23 2018-06-12 广州市天誉创高电子科技有限公司 A kind of processing system for video based on FPGA technology
CN105554416A (en) * 2015-12-24 2016-05-04 深圳市捷视飞通科技股份有限公司 FPGA (Field Programmable Gate Array)-based high-definition video fade-in and fade-out processing system and method
CN107257506A (en) * 2017-06-29 2017-10-17 徐文波 Many picture special efficacy loading methods and device
CN107682730B (en) * 2017-09-18 2020-02-07 北京嗨动视觉科技有限公司 Layer superposition processing method, layer superposition processing device and video processor
CN108769548A (en) * 2018-04-26 2018-11-06 深圳市微智体技术有限公司 A kind of decoding video output system and method
CN112437303A (en) * 2020-11-12 2021-03-02 北京深维科技有限公司 JPEG decoding method and device

Citations (2)

Publication number Priority date Publication date Assignee Title
CN1878260A (en) * 2006-07-14 2006-12-13 杭州国芯科技有限公司 Multi-menu co-screen playing method
CN101589619A (en) * 2007-11-20 2009-11-25 索尼株式会社 Information processing device, information processing method, display control device, display control method, and program

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN101742220B (en) * 2008-11-17 2011-12-28 中兴通讯股份有限公司 System and method for realizing multi-picture based on serial differential switch
CN201928357U (en) * 2010-11-24 2011-08-10 北京格非科技发展有限公司 Multi-format multi-picture separator
CN202160225U (en) * 2011-06-03 2012-03-07 北京彩讯科技股份有限公司 Digitized network video image synthesizer

Non-Patent Citations (2)

Title
CHEN, WENHUI ET AL.: "Synthesis System Design of Multi-video Based on FPGA.", MANUFACTURING AUTOMATION., vol. 32, no. 8, August 2010 (2010-08-01), pages 62 - 65 *
LI YOUXING ET AL.: "Design and Realization of SERDES Interface Base on FPGA.", PROCEEDINGS OF 5TH ANNUAL CONFERENCE OF CHINA INSTITUTE OF COMMUNICATIONS., February 2008 (2008-02-01), pages 11 - 14 *

Also Published As

Publication number Publication date
CN103841359A (en) 2014-06-04

Similar Documents

Publication Publication Date Title
WO2014079303A1 (en) Method, device and system for synthesizing multi-screen video
WO2009133671A1 (en) Video encoding and decoding device
TW583883B (en) System and method for multiple channel video transcoding
JP4993856B2 (en) Image conversion device, direct memory access device for image conversion, and camera interface supporting image conversion
US20070104377A1 (en) IMAGE CAPTURING APPARATUS AND IMAGE CAPTURING METHOD mmmmm
JP2594750B2 (en) Memory address control and display control device for high definition television
US10026146B2 (en) Image processing device including a progress notifier which outputs a progress signal
CN102843522B (en) The video-splicing transaction card of Based PC IE, its control system and control method
US7593580B2 (en) Video encoding using parallel processors
CN101577806A (en) Video terminal
US8179421B2 (en) Image synthesizing device and method and computer readable medium
KR100519133B1 (en) Image processor
CN101867808B (en) Method for accessing image data and relevant device thereof
CN112055159A (en) Image quality processing device and display apparatus
US7391932B2 (en) Apparatus and method for selecting image to be displayed
TWI559291B (en) Data buffering apparatus and related data buffering method
EP2382596B1 (en) A data processing apparatus for segmental processing of input data, systems using the apparatus and methods for data transmittal
CN102497514B (en) Three-channel video forwarding equipment and forwarding method
US9380260B2 (en) Multichannel video port interface using no external memory
CN114866733A (en) Low-delay video processing method, system and device
WO2003017664A1 (en) Method of realizing combination of multi-sets of multiple digital images and bus interface technique
WO2020022101A1 (en) Image processing device and image processing method
CN107241601B (en) Image data transmission method, device and terminal
CN103414898A (en) Method and system for collecting high-resolution video
JP2817409B2 (en) Color image signal decoding device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13856582; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 13856582; Country of ref document: EP; Kind code of ref document: A1)