WO2023029252A1 - Multi-viewpoint video data processing method, device, and storage medium - Google Patents

Multi-viewpoint video data processing method, device, and storage medium

Publication number
WO2023029252A1
Authority
WO
WIPO (PCT)
Prior art keywords
viewpoint
image
display device
image frame
images
Application number
PCT/CN2021/134319
Other languages
French (fr)
Chinese (zh)
Inventor
王荣刚
王振宇
高文
Original Assignee
北京大学深圳研究生院
Application filed by 北京大学深圳研究生院
Publication of WO2023029252A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440263 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/658 Transmission by the client directed to the server
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application discloses a multi-viewpoint video data processing method, a device, and a storage medium. The method comprises: when the viewpoint of a display device is switched, the display device sends the target viewpoint to a decoding device; the decoding device provides the display device with an image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame; when a picture switching condition is satisfied, the display device sends a picture switching condition satisfaction instruction to the decoding device; according to that instruction, the decoding device provides the display device with an image in the main image frame sequence corresponding to the target viewpoint and/or the image of the slave viewpoint corresponding to the target viewpoint, cropped from the slave image frame.

Description

Multi-viewpoint video data processing method, device, and storage medium
Related application
This application claims priority to Chinese patent application No. 202111035779.7, entitled "Multi-viewpoint video data processing method, device, and storage medium", filed with the China Patent Office on September 2, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of video processing, and in particular to a multi-viewpoint video data processing method, device, and storage medium.
Background
Free-viewpoint technology lets a viewer watch video from a freely chosen viewing angle. Current free-viewpoint applications allow the viewer to watch video from continuous viewpoints within a certain range. The viewer can set the position and angle of the viewpoint and is no longer limited to video shot from a single fixed camera angle, which enables 360° free-viewpoint viewing.
Current free-viewpoint applications often use a spatial stitching method to stitch the single-channel videos of multiple viewpoints together. When the user switches viewpoints in the application, the application uses the stitched videos to display the single-channel video of the viewpoint switched to. However, spatially stitching the single-channel videos of multiple viewpoints lowers the resolution of each viewpoint's video, so the picture resolution the free-viewpoint application needs for display is insufficient and the finally generated viewpoint picture has a low resolution.
Summary of the application
By providing a multi-viewpoint video data processing method, device, and storage medium, the embodiments of the present application aim to solve the technical problem that, after the single-channel videos of multiple viewpoints are stitched with a spatial stitching method, the picture resolution that free-viewpoint applications need for display is insufficient, so that the finally generated viewpoint picture has a low resolution.
An embodiment of the present application provides a multi-viewpoint video data processing method applied to a decoding device. The method comprises:
when a display instruction sent by a display device is received, acquiring the current viewpoint corresponding to the display instruction;
sending, to the display device, first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint; the first video data comprises an image in the main image frame sequence received over the main transmission path corresponding to the current viewpoint and/or an image of the slave viewpoint corresponding to the current viewpoint, cropped from a slave image frame in the slave image frame sequence received over the slave transmission path; the images in the main image frame sequence and the images in the slave image frame each comprise a viewpoint picture and/or a viewpoint depth map picture, and the resolution of the images in the main image frame sequence is greater than the resolution of the images in the slave image frame;
when a viewpoint switching instruction sent by the display device is received, acquiring the target viewpoint corresponding to the viewpoint switching instruction;
sending, to the display device, second video data required by the display device to display a first target viewpoint picture corresponding to the target viewpoint; the second video data comprises an image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame in the slave image frame sequence;
when a picture switching condition satisfaction instruction sent by the display device is received, sending, to the display device, third video data required by the display device to display a second target viewpoint picture corresponding to the target viewpoint; the third video data comprises an image in the main image frame sequence received over the main transmission path corresponding to the target viewpoint and/or an image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame in the slave image frame sequence.
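The decoding-side steps above can be read as a small three-message state machine. The following Python sketch is illustrative only: the class name `DecoderSession`, its method names, and the dictionary it returns are hypothetical, and it assumes a real (non-virtual) viewpoint, so that the main sequence alone suffices for the first and third video data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecoderSession:
    """Hypothetical decoder-side state; all names are illustrative."""
    current: Optional[int] = None  # viewpoint currently displayed
    pending: Optional[int] = None  # target viewpoint still shown at low resolution

    def on_display(self, viewpoint: int) -> dict:
        # Display instruction: first video data, here taken from the main path.
        self.current = viewpoint
        return {"data": "first", "viewpoint": viewpoint, "source": "main"}

    def on_switch(self, target: int) -> dict:
        # Viewpoint switching instruction: second video data is a crop
        # from the slave frame, which is already being received.
        self.pending = target
        return {"data": "second", "viewpoint": target, "source": "slave_crop"}

    def on_condition_met(self) -> dict:
        # Picture switching condition satisfied: third video data comes
        # from the main path of the target viewpoint.
        if self.pending is None:
            raise RuntimeError("no viewpoint switch in progress")
        self.current, self.pending = self.pending, None
        return {"data": "third", "viewpoint": self.current, "source": "main"}
```

The point of the sketch is the ordering: the low-resolution crop is served as soon as the switch is requested, and the high-resolution main sequence replaces it only after the display device reports that the switching condition holds.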
In an embodiment, before the step of sending, to the display device, the first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint, the method further comprises:
acquiring the viewpoint identifier of the slave viewpoint corresponding to the current viewpoint and the arrangement information of the slave image frame sequence;
determining, according to the arrangement information and the viewpoint identifier, the position information of the image of the slave viewpoint corresponding to the current viewpoint within the slave image frame;
cropping the image corresponding to the position information from a slave image frame in the slave image frame sequence.
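The lookup-and-crop step can be sketched as follows. This is a minimal illustration, assuming the arrangement information maps each viewpoint identifier to a rectangle `(x, y, width, height)` in the slave image frame; that layout, the function name `crop_slave_view`, and the use of nested lists as images are assumptions, not details given in the text.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]            # rows of pixel values (illustrative)
Rect = Tuple[int, int, int, int]   # (x, y, width, height), an assumed layout


def crop_slave_view(frame: Image, arrangement: Dict[int, Rect],
                    viewpoint_id: int) -> Image:
    """Crop one viewpoint's sub-image out of a stitched slave frame."""
    # Position information is looked up by viewpoint identifier.
    x, y, w, h = arrangement[viewpoint_id]
    return [row[x:x + w] for row in frame[y:y + h]]


# A 4x4 slave frame holding four 2x2 viewpoint images in a 2x2 grid.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
arrangement = {0: (0, 0, 2, 2), 1: (2, 0, 2, 2),
               2: (0, 2, 2, 2), 3: (2, 2, 2, 2)}
bottom_right = crop_slave_view(frame, arrangement, 3)
```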
In an embodiment, the step of sending, to the display device, the first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint comprises:
if the current viewpoint is not a virtual viewpoint, sending to the display device the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint.
In an embodiment, after the step of judging whether the current viewpoint is a virtual viewpoint, the method further comprises:
if the current viewpoint is a virtual viewpoint, sending to the display device the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint and the images of the slave viewpoints corresponding to the current viewpoint, cropped from slave image frames in the slave image frame sequence received over the slave transmission path; the slave viewpoints corresponding to the current viewpoint include the slave viewpoints adjacent to the current viewpoint.
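The real/virtual branch for the first video data might look like the sketch below. It rests on assumptions that this passage does not state: cameras are numbered with integers along the rig, a virtual viewpoint is a fractional position between two cameras, and the main path for a virtual viewpoint carries the nearest real camera. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Item = Tuple[str, int]  # (source, real viewpoint id)


def adjacent_slaves(viewpoint: float) -> List[int]:
    """Real cameras on either side of a fractional (virtual) position."""
    return sorted({math.floor(viewpoint), math.ceil(viewpoint)})


def first_video_data(viewpoint: float) -> List[Item]:
    if float(viewpoint).is_integer():
        # Real viewpoint: the high-resolution main sequence is enough.
        return [("main", int(viewpoint))]
    # Virtual viewpoint: main sequence plus crops of the adjacent slave
    # viewpoints. Assumption: the main path carries the nearest camera.
    main_view = math.floor(viewpoint + 0.5)
    return [("main", main_view)] + [("slave_crop", v)
                                    for v in adjacent_slaves(viewpoint)]
```

For example, `first_video_data(3)` selects only the main sequence of camera 3, while `first_video_data(2.3)` selects the main sequence of camera 2 together with slave crops of cameras 2 and 3.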
In an embodiment, the step of sending, to the display device, the second video data required by the display device to display the first target viewpoint picture corresponding to the target viewpoint comprises:
if the target viewpoint is not a virtual viewpoint, sending to the display device the image of the slave viewpoint identical to the target viewpoint, cropped from a slave image frame in the slave image frame sequence.
In an embodiment, after the step of sending to the display device, when the target viewpoint is not a virtual viewpoint, the image of the slave viewpoint identical to the target viewpoint cropped from a slave image frame in the slave image frame sequence, the step of sending to the display device, upon receiving the picture switching condition satisfaction instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint comprises:
when the picture switching condition satisfaction instruction sent by the display device is received, sending to the display device the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint.
In an embodiment, after the step of judging whether the target viewpoint is a virtual viewpoint, the method further comprises:
if the target viewpoint is a virtual viewpoint, sending to the display device the images of the slave viewpoints adjacent to the target viewpoint, cropped from a slave image frame in the slave image frame sequence.
In an embodiment, after the step of sending to the display device, when the target viewpoint is a virtual viewpoint, the images of the slave viewpoints adjacent to the target viewpoint cropped from a slave image frame in the slave image frame sequence, the step of sending to the display device, upon receiving the picture switching condition satisfaction instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint further comprises:
when the picture switching condition satisfaction instruction sent by the display device is received, sending to the display device the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint and the images of the slave viewpoints adjacent to the target viewpoint, cropped from a slave image frame in the slave image frame sequence.
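The four target-viewpoint cases above (real or virtual, before or after the switching condition is met) can be summarised in one function. As before, this is a sketch under assumptions: integer camera ids, fractional virtual positions, and a main path that carries the nearest real camera for a virtual target. `plan_switch` is a hypothetical name.

```python
import math
from typing import Dict, List, Tuple

Item = Tuple[str, int]  # (source, real viewpoint id)


def plan_switch(target: float) -> Dict[str, List[Item]]:
    """Return what is sent right after the switch ("second") and after the
    picture switching condition is satisfied ("third")."""
    if float(target).is_integer():
        t = int(target)
        # Real target: low-resolution crop first, main sequence afterwards.
        return {"second": [("slave_crop", t)], "third": [("main", t)]}
    neighbours = sorted({math.floor(target), math.ceil(target)})
    crops = [("slave_crop", v) for v in neighbours]
    main_view = math.floor(target + 0.5)  # assumption: nearest real camera
    # Virtual target: adjacent crops first, then main sequence plus crops.
    return {"second": crops, "third": [("main", main_view)] + crops}
```

In both cases the second video data uses only the slave image frame sequence, which is why the first picture after a switch needs no new stream.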
In addition, the present application further provides a multi-viewpoint video data processing method applied to an encoding device. The method comprises:
acquiring the images of the viewpoints captured by the cameras, where different cameras capture images of different viewpoints and each image comprises at least one of a viewpoint picture and a viewpoint depth map picture;
stitching the images of the viewpoints, and encoding the stitched images according to the shooting time and a first resolution to generate a slave image frame sequence;
encoding the images of each viewpoint according to the shooting time and a second resolution to generate a main image frame sequence for each viewpoint, where the resolution of the images in the main image frame sequence is greater than the resolution of the viewpoint images within the stitched image;
when a viewpoint selection instruction sent by a decoding device is received, acquiring the viewpoint selected by the decoding device according to the viewpoint selection instruction;
transmitting the main image frame sequence of the viewpoint selected by the decoding device to the decoding device over the main transmission path of that viewpoint, while transmitting the slave image frame sequence to the decoding device over the slave transmission path.
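The encoder-side data flow can be sketched without any real codec. In the illustration below, actual video encoding is omitted, downscaling by pixel skipping stands in for the lower first resolution, and side-by-side stitching stands in for the preset arrangement. All names (`build_streams`, `streams_for`, `downscale`) are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]
Captured = Tuple[float, Image]  # (shooting time, full-resolution image)


def downscale(img: Image, factor: int) -> Image:
    """Crude stand-in for a lower-resolution version of an image."""
    return [row[::factor] for row in img[::factor]]


def build_streams(captures: Dict[int, List[Captured]], factor: int = 2):
    # Main sequences: one per viewpoint, full resolution, ordered by time.
    main = {v: [img for _, img in sorted(frames, key=lambda c: c[0])]
            for v, frames in captures.items()}
    views = sorted(main)
    n_frames = min(len(seq) for seq in main.values())
    # Slave sequence: every viewpoint, downscaled and stitched side by side.
    slave = []
    for i in range(n_frames):
        small = [downscale(main[v][i], factor) for v in views]
        slave.append([sum((s[r] for s in small), [])
                      for r in range(len(small[0]))])
    return main, slave


def streams_for(selected: int, main, slave) -> dict:
    # Only the selected viewpoint's main sequence is sent; the slave
    # sequence, which covers every viewpoint, is always sent alongside it.
    return {"main_path": main[selected], "slave_path": slave}
```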
In an embodiment, the step of stitching the images of the viewpoints and encoding the stitched images according to the shooting time and the first resolution to generate the slave image frame sequence comprises:
stitching the images of the viewpoints in a preset arrangement to generate a stitched image and arrangement information describing how the images of the viewpoints are arranged in the stitched image, the arrangement information comprising at least the viewpoint identifier of each viewpoint and the position information of each viewpoint's image within the stitched image;
sorting the stitched images by shooting time to generate a stitched image sequence;
encoding the stitched image sequence at the first resolution, and marking the encoded stitched image sequence with the arrangement information to obtain the slave image frame sequence.
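One possible preset arrangement is a row-major grid. The sketch below stitches equally sized viewpoint images into such a grid and records, for each viewpoint identifier, where its image landed. The grid layout, the `(x, y, width, height)` position format, and the name `stitch_grid` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]
Rect = Tuple[int, int, int, int]  # (x, y, width, height), an assumed format


def stitch_grid(images: Dict[int, Image],
                columns: int) -> Tuple[Image, Dict[int, Rect]]:
    """Stitch equally sized images into a grid and return the stitched
    image together with its arrangement information."""
    ids = sorted(images)
    h, w = len(images[ids[0]]), len(images[ids[0]][0])
    rows = -(-len(ids) // columns)  # ceiling division
    canvas = [[0] * (w * columns) for _ in range(h * rows)]
    arrangement: Dict[int, Rect] = {}
    for index, vid in enumerate(ids):
        x, y = (index % columns) * w, (index // columns) * h
        for r in range(h):
            canvas[y + r][x:x + w] = images[vid][r]
        arrangement[vid] = (x, y, w, h)
    return canvas, arrangement


views = {0: [[1, 1], [1, 1]], 1: [[2, 2], [2, 2]], 2: [[3, 3], [3, 3]]}
stitched, arrangement = stitch_grid(views, columns=2)
```

The returned arrangement is exactly what a decoder would need in order to crop a given viewpoint back out of the stitched frame; unused grid cells are left as zeros.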
In an embodiment, the step of marking the encoded stitched image sequence with the arrangement information to obtain the slave image frame sequence comprises:
inserting the arrangement information into the sequence header of the encoded stitched image sequence to obtain the slave image frame sequence.
In an embodiment, the step of marking the encoded stitched image sequence with the arrangement information to obtain the slave image frame sequence further comprises:
inserting the arrangement information into each stitched image in the encoded stitched image sequence to obtain the slave image frame sequence.
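The two marking options differ only in where the arrangement information travels. The sketch below models both with plain dictionaries and JSON; a real bitstream would carry this data in codec-specific syntax, which this passage does not specify. The container layout and the names `tag_sequence` and `read_arrangement` are hypothetical.

```python
import json
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]


def tag_sequence(frames: List[bytes], arrangement: Dict[int, Rect],
                 per_frame: bool = False) -> dict:
    """Attach arrangement info once in the header or to every frame."""
    info = json.dumps({str(k): list(v)
                       for k, v in sorted(arrangement.items())})
    if per_frame:
        return {"header": None,
                "frames": [{"arrangement": info, "payload": f}
                           for f in frames]}
    return {"header": {"arrangement": info},
            "frames": [{"payload": f} for f in frames]}


def read_arrangement(sequence: dict, frame_index: int = 0) -> Dict[int, Rect]:
    # Prefer the sequence header; fall back to the frame's own copy.
    holder = sequence["header"] or sequence["frames"][frame_index]
    return {int(k): tuple(v)
            for k, v in json.loads(holder["arrangement"]).items()}
```

Carrying the information once in the header costs less, while repeating it in every frame lets a decoder that joins mid-stream recover the layout from any frame it receives.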
In addition, to achieve the above object, the present application further provides a decoding device comprising: a memory, a processor, and a multi-viewpoint video data processing program stored in the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the multi-viewpoint video data processing method described above.
In addition, to achieve the above object, the present application further provides an encoding device comprising: a memory, a processor, and a multi-viewpoint video data processing program stored in the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the multi-viewpoint video data processing method described above.
In addition, to achieve the above object, the present application further provides a storage medium storing a multi-viewpoint video data processing program that, when executed by a processor, implements the steps of the multi-viewpoint video data processing method described above.
The technical solutions of the multi-viewpoint video data processing method, device, and storage medium provided in the embodiments of the present application have at least the following technical effects or advantages:
When the viewpoint of the display device is switched, the display device sends the target viewpoint to the decoding device, and the decoding device provides the display device with the image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame in the slave image frame sequence, so that the display device first displays a low-resolution viewpoint picture of the target viewpoint. When the picture switching condition is satisfied, the display device sends a picture switching condition satisfaction instruction to the decoding device, and the decoding device accordingly provides the display device with the image in the main image frame sequence corresponding to the target viewpoint and/or the image of the slave viewpoint corresponding to the target viewpoint cropped from the slave image frame, so that the display device displays a high-resolution viewpoint picture of the target viewpoint. This solves the technical problem that, after the single-channel videos of multiple viewpoints are stitched with a spatial stitching method, the picture resolution that free-viewpoint applications need for display is insufficient and the finally generated viewpoint picture has a low resolution. The solution not only achieves zero-delay video display at the moment of viewpoint switching, but also quickly restores the video from low resolution to high resolution after the switch, preserving clarity during prolonged viewing.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of the hardware operating environment involved in the embodiments of the present application;
FIG. 2 is a schematic flowchart of a first embodiment of the multi-viewpoint video data processing method of the present application;
FIG. 3 is a schematic diagram of the camera arrangement during multi-viewpoint video shooting;
FIG. 4 is a schematic flowchart of a second embodiment of the multi-viewpoint video data processing method of the present application;
FIG. 5 is a schematic flowchart of a third embodiment of the multi-viewpoint video data processing method of the present application;
FIG. 6 is a schematic diagram in which the current viewpoint is a real viewpoint;
FIG. 7 is a schematic diagram in which the current viewpoint is a virtual viewpoint;
FIG. 8 is a schematic flowchart of a fourth embodiment of the multi-viewpoint video data processing method of the present application;
FIG. 9 is a schematic diagram in which the target viewpoint is a real viewpoint;
FIG. 10 is a schematic diagram in which the target viewpoint is a virtual viewpoint;
FIG. 11 is a schematic flowchart of a fifth embodiment of the multi-viewpoint video data processing method of the present application;
FIG. 12 is a schematic diagram of one preset arrangement;
FIG. 13 is a schematic diagram of another preset arrangement.
Detailed description
For a better understanding of the above technical solutions, exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
FIG. 1 is a schematic structural diagram of the hardware operating environment involved in the embodiments of the present application.
It should be noted that FIG. 1 may serve as the schematic structural diagram of the hardware operating environment of either the decoding device or the encoding device.
As shown in FIG. 1, the decoding device or encoding device may include: a processor 1001, such as a CPU; a memory 1005; a user interface 1003; a network interface 1004; and a communication bus 1002. The communication bus 1002 provides connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. In an embodiment, the network interface 1004 may include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. In an embodiment, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the decoding device or encoding device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a multi-viewpoint video data processing program. The operating system is a program that manages and controls the hardware and software resources of the decoding device or encoding device and supports the running of the multi-viewpoint video data processing program and other software or programs.
In the decoding device or encoding device shown in FIG. 1, the user interface 1003 is mainly used to connect to a terminal and exchange data with it; the network interface 1004 is mainly used to connect to a background server and exchange data with it; and the processor 1001 may be used to invoke the multi-viewpoint video data processing program stored in the memory 1005.
In this embodiment, the decoding device or encoding device comprises: a memory 1005, a processor 1001, and a multi-viewpoint video data processing program stored in the memory 1005 and executable on the processor 1001, wherein:
when applied to the decoding device, the processor 1001 performs the following operations upon invoking the multi-viewpoint video data processing program stored in the memory 1005:
when a display instruction sent by a display device is received, acquiring the current viewpoint corresponding to the display instruction;
sending, to the display device, first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint; the first video data comprises an image in the main image frame sequence received over the main transmission path corresponding to the current viewpoint and/or an image of the slave viewpoint corresponding to the current viewpoint, cropped from a slave image frame in the slave image frame sequence received over the slave transmission path; the images in the main image frame sequence and the images in the slave image frame each comprise a viewpoint picture and/or a viewpoint depth map picture, and the resolution of the images in the main image frame sequence is greater than the resolution of the images in the slave image frame;
when a viewpoint switching instruction sent by the display device is received, acquiring the target viewpoint corresponding to the viewpoint switching instruction;
sending, to the display device, second video data required by the display device to display a first target viewpoint picture corresponding to the target viewpoint; the second video data comprises an image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame in the slave image frame sequence;
when a picture switching condition satisfaction instruction sent by the display device is received, sending, to the display device, third video data required by the display device to display a second target viewpoint picture corresponding to the target viewpoint; the third video data comprises an image in the main image frame sequence received over the main transmission path corresponding to the target viewpoint and/or an image of the slave viewpoint corresponding to the target viewpoint, cropped from a slave image frame in the slave image frame sequence.
When applied to an encoding device, the processor 1001 performs the following operations when it invokes the multi-viewpoint video data processing program stored in the memory 1005:
Acquiring the images of the viewpoints captured by the cameras, where different cameras capture the images of different viewpoints and each image includes at least one of a viewpoint picture and a viewpoint depth-map picture;
Stitching the images of the viewpoints, and encoding the stitched images according to the shooting time and a first resolution to generate a secondary image frame sequence;
Encoding the images of each viewpoint according to the shooting time and a second resolution to generate a main image frame sequence for each viewpoint, where the resolution of the images in the main image frame sequence is greater than the resolution of each viewpoint's image within the stitched image;
Upon receiving a viewpoint selection instruction sent by the decoding device, acquiring the viewpoint selected by the decoding device according to the viewpoint selection instruction;
Transmitting the main image frame sequence of the viewpoint selected by the decoding device to the decoding device over that viewpoint's main transmission path, while transmitting the secondary image frame sequence to the decoding device over the secondary transmission path.
The embodiments of the present application provide embodiments of a multi-viewpoint video data processing method. It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
As shown in FIG. 2, in the first embodiment of the present application, the multi-viewpoint video data processing method is applied to a decoding device and includes the following steps:
Step S210: Upon receiving a display instruction sent by the display device, acquire the current viewpoint corresponding to the display instruction.
In this embodiment, the display device is a device on which a multi-viewpoint video playback application is installed, such as a smartphone, a tablet, a smart TV, or a computer. Through the multi-viewpoint video playback application, the user can select different viewpoints on the display device to watch a video. The video may be a live video, such as a live basketball or football match, or a recorded video, such as a recorded badminton match.
This embodiment is described using a live video as an example, namely a live video of a basketball match. To shoot the live video, several cameras are arranged around the venue, and each camera captures the match from one angle. The image captured by a camera from its angle is the image of one viewpoint, and the image of each viewpoint includes at least one of a viewpoint picture and a viewpoint depth-map picture. As shown in FIG. 3, the numerals 1 to 9 denote viewpoints P1 to P9, and one camera is placed at each viewpoint, namely cameras P1 to P9. Cameras P1 to P9 are the nine cameras shooting this basketball match, and each is responsible for the images of one viewpoint: the images captured by camera P1 are the images of viewpoint P1, the images captured by camera P2 are the images of viewpoint P2, and so on up to camera P9 and viewpoint P9.
After cameras P1 to P9 have captured the images of viewpoints P1 to P9, the encoding device encodes those images. The encoding device first stitches the images of all viewpoints together in a preset arrangement, producing a stitched image and the arrangement information of each viewpoint's image within the stitched image. A stitched image is assembled from the images that viewpoints P1 to P9 captured at the same moment; it can be understood as one large image divided into nine small images. The arrangement information includes at least the viewpoint identifier of each viewpoint and the position information of each viewpoint's image within the stitched image. The viewpoint identifier indicates which viewpoint each small image in the large image belongs to; for example, if the viewpoint identifier of a small image is P9, that image is the image of viewpoint P9. The position information indicates where in the large image each small image is placed.
After the stitched images and the arrangement information have been generated, the stitched images are sorted by shooting time to form a stitched image sequence. Suppose n stitched images have been generated in order of shooting time, namely image 1, image 2, image 3, ..., image n; then images 1 to n, arranged in that order, form the stitched image sequence. The stitched image sequence is then encoded at the preset first resolution, and the encoded stitched image sequence is tagged with the arrangement information, which yields the secondary image frame sequence.
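The stitching step and its arrangement information can be illustrated with a minimal Python sketch. It is not the patent's implementation: the row-major grid, the `Tile` record, the `stitch` function, and the use of nested lists as stand-ins for images are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """One entry of the (hypothetical) arrangement information."""
    viewpoint_id: str  # e.g. "P9"
    x: int             # left edge of the small image inside the stitched image
    y: int             # top edge of the small image inside the stitched image
    width: int
    height: int


def stitch(images, cols):
    """Stitch same-sized viewpoint images into one grid, row by row.

    images: dict mapping viewpoint id -> image as a list of pixel rows.
    Returns (stitched_image, arrangement), where arrangement is a list of Tile.
    """
    ids = list(images)
    tile_h = len(images[ids[0]])
    tile_w = len(images[ids[0]][0])
    rows = -(-len(ids) // cols)  # ceiling division
    stitched = [[0] * (cols * tile_w) for _ in range(rows * tile_h)]
    arrangement = []
    for index, viewpoint_id in enumerate(ids):
        row, col = divmod(index, cols)
        x0, y0 = col * tile_w, row * tile_h
        for dy, line in enumerate(images[viewpoint_id]):
            stitched[y0 + dy][x0:x0 + tile_w] = line  # copy one pixel row
        arrangement.append(Tile(viewpoint_id, x0, y0, tile_w, tile_h))
    return stitched, arrangement


# Nine tiny 4x2 "images", one per viewpoint, each filled with its viewpoint number.
views = {f"P{i}": [[i] * 4 for _ in range(2)] for i in range(1, 10)}
stitched, arrangement = stitch(views, cols=3)
```

With a 3x3 grid, the nine small images become one 12x6 large image, and the last arrangement entry records that viewpoint P9 occupies the bottom-right region.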
Further, while the secondary image frame sequence is being generated, the images captured at each of viewpoints P1 to P9 are also encoded separately according to the shooting time and a preset second resolution, generating a main image frame sequence for each viewpoint. That is, the images captured at viewpoint P1 are encoded separately according to the shooting time and the second resolution to generate the main image frame sequence of viewpoint P1, the images captured at viewpoint P2 are encoded in the same way to generate the main image frame sequence of viewpoint P2, and so on up to viewpoint P9, so that nine main image frame sequences are generated. Each main image frame of a viewpoint is one encoded image. The first resolution is the total resolution of the stitched image, and the resolution of each viewpoint's image within the stitched image is lower than the second resolution; in other words, each viewpoint's image within the stitched image has a lower resolution than the images in that viewpoint's main image frame sequence.
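A short worked check makes the relationship between the two resolutions concrete. The numbers are hypothetical and do not come from the application: a 3840x2160 stitched frame laid out as a 3x3 grid, and a 1920x1080 main sequence per viewpoint.

```python
def tile_resolution(total_width, total_height, cols, rows):
    """Per-viewpoint resolution inside a stitched frame laid out as a cols x rows grid."""
    return total_width // cols, total_height // rows


# Assumed values for illustration only.
first_resolution = (3840, 2160)   # total resolution of the stitched image
second_resolution = (1920, 1080)  # resolution of each main image frame

tile_w, tile_h = tile_resolution(*first_resolution, cols=3, rows=3)
per_view_pixels = tile_w * tile_h
main_pixels = second_resolution[0] * second_resolution[1]
```

Under these assumed values each viewpoint occupies 1280x720 pixels of the stitched frame, which is fewer pixels than a 1920x1080 main image frame, as the text requires.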
Further, after the encoding device has generated the secondary image frame sequence and the main image frame sequence of each viewpoint, it acquires the viewpoint selected by the decoding device from the viewpoint selection instruction received from the decoding device, then transmits the main image frame sequence of the selected viewpoint to the decoding device over that viewpoint's main transmission path, while transmitting the secondary image frame sequence to the decoding device over the secondary transmission path. The encoding device transmits the main image frame sequence of each viewpoint to the decoding device over its own independent transmission path, and also transmits the generated secondary image frame sequence to the display device over an independent transmission path. For ease of understanding, an independent transmission path that carries a viewpoint's main image frame sequence is called a main transmission path, and the independent transmission path that carries the secondary image frame sequence is called the secondary transmission path. With nine viewpoints there are nine main transmission paths and one secondary transmission path. Each main transmission path carries the main image frame sequence of its viewpoint (for example, the main transmission path of viewpoint P1 carries the main image frame sequence of viewpoint P1), and the secondary transmission path carries the secondary image frame sequence.
The viewpoint selection instruction is generated by the decoding device from either a display instruction or a picture-switching-condition-satisfied instruction sent by the display device. For example, suppose the display device currently needs to display the viewpoint picture of a certain viewpoint, say viewpoint P2. The display device generates a display instruction for viewpoint P2 and sends the display instruction, which contains viewpoint P2, to the decoding device. After obtaining viewpoint P2 from the display instruction, the decoding device generates a viewpoint selection instruction containing viewpoint P2 and sends it to the encoding device. The encoding device obtains viewpoint P2 from the received viewpoint selection instruction; viewpoint P2 is the viewpoint selected by the decoding device. The encoding device then transmits the main image frame sequence of viewpoint P2 to the decoding device over the main transmission path of viewpoint P2, and at the same time transmits the secondary image frame sequence to the decoding device over the secondary transmission path.
As another example, suppose the user switches from the current viewpoint P2 to the target viewpoint P4 on the display device, so that the viewpoint picture of viewpoint P4 is the one about to be displayed. When the display device determines that the picture switching condition for viewpoint P4 is satisfied, it generates a picture-switching-condition-satisfied instruction containing viewpoint P4 and sends it to the decoding device. After obtaining viewpoint P4 from that instruction, the decoding device generates a viewpoint selection instruction containing viewpoint P4 and sends it to the encoding device. The encoding device obtains viewpoint P4 from the received viewpoint selection instruction; viewpoint P4 is the viewpoint selected by the decoding device. The encoding device then transmits the main image frame sequence of viewpoint P4 to the decoding device over the main transmission path of viewpoint P4.
Specifically, when the user opens the multi-viewpoint video playback application on the display device and first starts watching the live video of the basketball match, the application generally takes a default viewpoint as the current viewpoint and displays the current viewpoint picture (the basketball match video picture) corresponding to it. Before the display device displays the current viewpoint picture, it generates a display instruction containing the current viewpoint and sends it to the decoding device. Upon receiving the display instruction, the decoding device acquires the current viewpoint corresponding to it. For example, if the default viewpoint is viewpoint P2, then viewpoint P2 is the current viewpoint and the display device needs to display the viewpoint picture of viewpoint P2; after receiving the display instruction, the decoding device obtains viewpoint P2 from it.
Step S220: Send to the display device the first video data required for the display device to display the current viewpoint picture corresponding to the current viewpoint.
After acquiring the current viewpoint corresponding to the display instruction, the decoding device generates a viewpoint selection instruction containing the current viewpoint and sends it to the encoding device. The encoding device obtains the current viewpoint from the received viewpoint selection instruction and transmits the main image frame sequence of the current viewpoint to the decoding device over the main transmission path of the current viewpoint, while also transmitting the secondary image frame sequence to the decoding device over the secondary transmission path. The decoding device thus receives both the main image frame sequence of the current viewpoint, over the corresponding main transmission path, and the secondary image frame sequence, over the secondary transmission path. It then obtains, from the main image frame sequence of the current viewpoint and/or the secondary image frame sequence, the first video data required for the display device to display the current viewpoint picture, and sends the first video data to the display device.
Specifically, the first video data includes images in the main image frame sequence that the decoding device received over the main transmission path of the current viewpoint and/or the image of the secondary viewpoint corresponding to the current viewpoint, cropped from a secondary image frame in the secondary image frame sequence that the decoding device received over the secondary transmission path. The images in the main image frame sequence and the images in the secondary image frame each include a viewpoint picture and/or a viewpoint depth-map picture, and the resolution of the images in the main image frame sequence is greater than the resolution of the images in the secondary image frame. A secondary image frame is a stitched image in the secondary image frame sequence and contains the images of all viewpoints. If the stitched image was assembled from the images of viewpoints P1 to P9, the secondary image frame contains the images of viewpoints P1 to P9. The viewpoint of each image within a secondary image frame is called a secondary viewpoint. For example, if the current viewpoint is viewpoint P1, then the image of viewpoint P1 within the secondary image frame is the image of the secondary viewpoint corresponding to the current viewpoint, and the decoding device crops the image of viewpoint P1 from a secondary image frame in the secondary image frame sequence.
Further, after receiving the first video data, the display device displays the current viewpoint picture corresponding to the current viewpoint according to the first video data. The current viewpoint picture that the user sees is then of higher resolution; that is, the live video of the basketball match is seen at higher resolution.
Step S230: Upon receiving a viewpoint switching instruction sent by the display device, acquire the target viewpoint corresponding to the viewpoint switching instruction.
Step S240: Send to the display device the second video data required for the display device to display the first target viewpoint picture corresponding to the target viewpoint.
In this embodiment, the multi-viewpoint video playback application supports viewpoint switching: the user can select the target viewpoint to switch to in the application in order to watch the live video of the basketball match from that viewpoint. Specifically, the user selects the target viewpoint on the video playback interface of the display device. After obtaining the target viewpoint, the display device determines that the user wants to switch viewpoints and that the second video data required for the first target viewpoint picture corresponding to the target viewpoint needs to be displayed to the user. The display device therefore generates a viewpoint switching instruction from the target viewpoint and sends it to the decoding device, which receives the instruction and obtains the target viewpoint from it. For example, if the current viewpoint is viewpoint P1 and the user selects viewpoint P2 as the target viewpoint, the target viewpoint that the decoding device obtains from the viewpoint switching instruction is viewpoint P2.
After acquiring the target viewpoint, the decoding device crops the image of the secondary viewpoint corresponding to the target viewpoint from a secondary image frame in the secondary image frame sequence, thereby obtaining the second video data, and sends the second video data to the display device. The display device displays the first target viewpoint picture corresponding to the target viewpoint according to the second video data. For example, if the target viewpoint is viewpoint P2, the second video data includes the image of the secondary viewpoint corresponding to viewpoint P2, cropped from a secondary image frame in the secondary image frame sequence; if that image is image F2 in the secondary image frame, the display device displays image F2. Because each viewpoint's image in the secondary image frame has a lower resolution than the images in that viewpoint's main image frame sequence, when the display device displays image F2 the live basketball video that the user sees from viewpoint P2 has a lower resolution than the live basketball video seen from viewpoint P1.
Step S250: Upon receiving a picture-switching-condition-satisfied instruction sent by the display device, send to the display device the third video data required for the display device to display the second target viewpoint picture corresponding to the target viewpoint.
After acquiring the target viewpoint, the decoding device sends a viewpoint selection instruction containing the target viewpoint to the encoding device. The encoding device obtains the target viewpoint from the received viewpoint selection instruction and transmits the main image frame sequence of the target viewpoint to the decoding device over the main transmission path of the target viewpoint, while also transmitting the secondary image frame sequence to the decoding device over the secondary transmission path. The decoding device thus receives both the main image frame sequence of the target viewpoint, over the corresponding main transmission path, and the secondary image frame sequence, over the secondary transmission path. It then obtains, from the main image frame sequence of the target viewpoint and/or the secondary image frame sequence, the third video data required for the display device to display the second target viewpoint picture corresponding to the target viewpoint. In other words, the decoding device prepares the third video data in advance.
In this embodiment, the second video data required for the display device to display the first target viewpoint picture is cropped from a secondary image frame in the secondary image frame sequence, so the first target viewpoint picture has a relatively low resolution and the picture quality presented to the user is relatively poor. Therefore, after the first target viewpoint picture has been displayed for a period of time, the display device resumes displaying a higher-resolution viewpoint picture of the target viewpoint; if the viewpoint does not change again, the display device keeps displaying the higher-resolution viewpoint picture.
Resuming display of the higher-resolution viewpoint picture of the target viewpoint means displaying the third video data required for the second target viewpoint picture corresponding to the target viewpoint. The third video data includes images in the main image frame sequence received by the encoding end over the main transmission path of the target viewpoint and/or the image of the secondary viewpoint corresponding to the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence. The secondary-viewpoint image in the third video data is different from the second video data: the decoding device sends the second video data to the display device before the third video data, and sends the third video data to the display device in response to the picture-switching-condition-satisfied instruction sent by the display device.
The moment at which the display device resumes displaying the higher-resolution viewpoint picture of the target viewpoint is determined by the picture switching condition. The picture switching condition can be understood as follows: the display device has finished displaying the second video data required for the first target viewpoint picture, and at the next moment it goes on to display the third video data required for the second target viewpoint picture; that is, the time to display the second target viewpoint picture has arrived. For example, if the second video data finishes displaying at 10 minutes 00 seconds, then playback must continue seamlessly from the first target viewpoint picture at 10 minutes 01 seconds; that is, at 10 minutes 01 seconds the third video data required for the second target viewpoint picture is displayed.
Further, when the display device determines that the picture switching condition is satisfied, it generates a picture-switching-condition-satisfied instruction and sends it to the decoding device. The decoding device obtains the third video data according to the instruction and sends it to the display device, which displays the second target viewpoint picture corresponding to the target viewpoint according to the third video data. The display device thus returns from the lower-resolution first target viewpoint picture to the higher-resolution second target viewpoint picture, and the user again sees a higher-resolution viewpoint picture, that is, a higher-resolution live video of the basketball match. If the viewpoint does not change again, the display device keeps displaying the higher-resolution viewpoint picture.
Based on steps S210 to S250 above, this embodiment describes the behavior of the display device with the following example:
For example, when the user has just opened the multi-viewpoint video playback application to watch the live video of the basketball match, the default current viewpoint is viewpoint P1. The display device displays the viewpoint picture of viewpoint P1 according to the first video data sent by the decoding device. At this point the viewpoint picture of viewpoint P1 has a relatively high resolution, and the user sees a higher-resolution live video of the match. If the user then switches from viewpoint P1 to viewpoint P2, the display device displays the first target viewpoint picture of viewpoint P2 according to the second video data sent by the decoding device. At this point the first target viewpoint picture has a relatively low resolution, and the user sees a lower-resolution live video of the match. After the second video data has finished displaying, the display device goes on to display the second target viewpoint picture of viewpoint P2 according to the third video data sent by the decoding device. At this point the second target viewpoint picture has a relatively high resolution, and the user again sees a higher-resolution live video of the match.
In other words, the live video seen by the user returns from low resolution to high resolution. If viewpoint P2 is not switched afterwards, the display device keeps showing the user the higher-resolution live video of the basketball match.
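The P1-to-P2 walkthrough can be sketched as a small state machine on the decoding device. This is a simplified reading, not the application's implementation: the `DecoderSession` class and its method names are hypothetical, the first and third video data are assumed to come from the main image frame sequence only (the text also allows the secondary crop), and the viewpoint selection instruction is assumed to be sent as soon as the target viewpoint is known.

```python
from enum import Enum


class Source(Enum):
    MAIN = "main image frame sequence (higher resolution)"
    SECONDARY = "crop from secondary image frame (lower resolution)"


class DecoderSession:
    """Tracks which stream feeds the display device (hypothetical helper)."""

    def __init__(self):
        self.viewpoint = None
        self.source = None
        self.selection_instructions = []  # viewpoints requested from the encoding device

    def on_display_instruction(self, viewpoint):
        # S210/S220: request the viewpoint's main sequence, serve first video data.
        self.viewpoint = viewpoint
        self.selection_instructions.append(viewpoint)
        self.source = Source.MAIN
        return viewpoint, self.source

    def on_viewpoint_switch(self, target):
        # S230/S240: serve second video data at once from the secondary sequence,
        # while the target's main sequence is requested in the background.
        self.viewpoint = target
        self.selection_instructions.append(target)
        self.source = Source.SECONDARY
        return target, self.source

    def on_switch_condition_satisfied(self):
        # S250: the target's main sequence is ready, serve third video data.
        self.source = Source.MAIN
        return self.viewpoint, self.source


session = DecoderSession()
first = session.on_display_instruction("P1")
second = session.on_viewpoint_switch("P2")
third = session.on_switch_condition_satisfied()
```

The point of the design is that the secondary sequence is always being received, so the switch to P2 never has to wait for a new main sequence to arrive; the resolution drops briefly and then recovers.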
With the above technical solution, when a live or recorded video is being watched, this embodiment not only achieves zero-delay video display at the moment of viewpoint switching, but also quickly restores the video from low resolution to high resolution after the switch, which preserves picture clarity over long viewing sessions.
As shown in FIG. 4, in the second embodiment of the present application, based on the first embodiment, the following steps are further included before step S210:
Step S110: Obtain the viewpoint identifier of the secondary viewpoint corresponding to the current viewpoint and the arrangement information of the secondary image frame sequence.
Step S120: Determine, according to the arrangement information and the viewpoint identifier, the position information of the image of the secondary viewpoint corresponding to the current viewpoint within the secondary image frame.
Step S130: Crop the image corresponding to the position information from a secondary image frame in the secondary image frame sequence.
Before generating the secondary image frame sequence, the encoding device first stitches the images of the viewpoints according to a preset arrangement, producing stitched images together with arrangement information that describes where the image of each viewpoint lies in the stitched image. It then encodes the stitched image sequence according to the shooting time and a preset first resolution, and marks the encoded stitched image sequence with the arrangement information, thereby obtaining the secondary image frame sequence. The marking can be done in either of two ways: the arrangement information can be inserted into the sequence header of the encoded stitched image sequence, or it can be inserted into each stitched image of the encoded stitched image sequence. When the arrangement information is inserted into the sequence header, the decoding device only needs to read the arrangement information in the sequence header of the secondary image frame sequence to obtain the position information of the image of each secondary viewpoint in every secondary image frame of the sequence. When the arrangement information is inserted into each stitched image, the decoding device must read the arrangement information in each secondary image frame of the sequence to obtain the position information of the image of each secondary viewpoint in that frame. The arrangement information includes at least the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the stitched image.
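The difference between the two marking options can be illustrated with a minimal Python sketch. The container used here (a dict with `header` and `frames` keys and a `layout` field) is purely hypothetical and stands in for whatever bitstream syntax a real codec would use; the source does not specify one.

```python
from typing import Dict, List, Optional

# One arrangement entry per viewpoint image: {x, y, w, h, view_id}.
Layout = List[Dict[str, object]]


def layout_for_frame(sequence: dict, frame_index: int) -> Layout:
    """Return the arrangement information that applies to one secondary frame."""
    header_layout: Optional[Layout] = sequence.get("header", {}).get("layout")
    if header_layout is not None:
        # Option 1: read once from the sequence header; valid for every frame.
        return header_layout
    # Option 2: every frame carries its own copy and must be read individually.
    return sequence["frames"][frame_index]["layout"]


# Hypothetical sample data for both marking options.
p1 = {"x": 0, "y": 0, "w": 640, "h": 360, "view_id": "P1"}
header_marked = {"header": {"layout": [p1]}, "frames": [{}, {}]}
frame_marked = {"header": {}, "frames": [{"layout": [p1]}, {"layout": [p1]}]}
```

With header marking the decoder performs a single lookup for the whole sequence; with per-frame marking it repeats the lookup for each frame, which is the trade-off the paragraph above describes.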
When the image of each viewpoint includes a viewpoint picture, the arrangement information has the format {x,y,w,h,view_id}, where x and y are the coordinates, in the stitched image, of the top-left pixel of the viewpoint picture, w and h are the width and height of the viewpoint picture, and view_id is the viewpoint identifier corresponding to the image or viewpoint picture; x, y, w and h together represent the position information. When the image of each viewpoint includes a viewpoint picture and a viewpoint depth map picture, the arrangement information has the format {x,y,w,h,view_id,is_depth}, where x and y are the coordinates, in the stitched image, of the top-left pixel of the viewpoint picture or viewpoint depth map picture, w and h are the width and height of that picture, view_id is the viewpoint identifier corresponding to that picture, and is_depth marks whether the picture is a viewpoint depth map picture; again, x, y, w and h represent the position information. The images of the viewpoints are arranged in the same way in the stitched image and in the secondary image frame, so the stitched image is the same as the secondary image frame, and the arrangement information of the two is also the same.
In this embodiment, when the decoding device obtains the image of the secondary viewpoint corresponding to the current viewpoint, it obtains the viewpoint identifier of that secondary viewpoint and the arrangement information of the secondary image frame sequence. For example, if the current viewpoint is the P2 viewpoint, the viewpoint identifier corresponding to the current viewpoint is P2. Having obtained the viewpoint identifier and the arrangement information, the decoding device finds, in the arrangement information, the viewpoint identifier that is the same as the identifier of the current viewpoint, and from that entry determines the position information of the image of the corresponding secondary viewpoint within the secondary image frame. It then crops the image at that position from a secondary image frame in the secondary image frame sequence. If the current viewpoint is the P2 viewpoint, the viewpoint identified as P2 in the secondary image frame is the secondary viewpoint corresponding to the P2 viewpoint.
For example, the viewpoint identifier corresponding to the current viewpoint is P2. After the entry whose viewpoint identifier is also P2 has been found in the arrangement information, the image of the secondary viewpoint corresponding to the current viewpoint is determined from that entry, and the image of the P2 viewpoint is cropped from the secondary image frame according to the arrangement information.
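Steps S110 to S130 can be sketched in Python as follows. This is an illustrative sketch only: the frame is modelled as a plain 2D list of pixel values, and the names `LayoutEntry`, `crop` and `extract_view` are hypothetical helpers, not part of the application. Only the fields x, y, w, h, view_id and is_depth come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

Image = List[List[int]]  # rows of pixel values (assumed representation)


@dataclass(frozen=True)
class LayoutEntry:
    x: int          # column of the top-left pixel in the stitched frame
    y: int          # row of the top-left pixel in the stitched frame
    w: int          # width of the sub-picture
    h: int          # height of the sub-picture
    view_id: str    # viewpoint identifier, e.g. "P2"
    is_depth: bool = False  # True if the sub-picture is a depth map


def crop(frame: Image, entry: LayoutEntry) -> Image:
    """S130: cut the rectangle described by the entry out of the frame."""
    rows = frame[entry.y:entry.y + entry.h]
    return [row[entry.x:entry.x + entry.w] for row in rows]


def extract_view(frame: Image, layout: List[LayoutEntry], view_id: str) -> Dict[str, Image]:
    """S110-S120: match the viewpoint identifier, then crop each matching entry."""
    matches = [e for e in layout if e.view_id == view_id]
    if not matches:
        raise KeyError(f"viewpoint {view_id!r} not found in arrangement information")
    return {("depth" if e.is_depth else "picture"): crop(frame, e) for e in matches}


# A 4x2 stitched frame holding two 2x2 viewpoint pictures side by side.
frame = [[1, 1, 2, 2],
         [1, 1, 2, 2]]
layout = [LayoutEntry(0, 0, 2, 2, "P1"), LayoutEntry(2, 0, 2, 2, "P2")]
p2 = extract_view(frame, layout, "P2")
```

Here `p2["picture"]` holds the right-hand 2x2 block, which is the sub-image registered under the identifier P2.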
As shown in FIG. 5, in the third embodiment of the present application, based on the first embodiment, step S220 includes the following steps:
Step S221: Determine whether the current viewpoint is a virtual viewpoint; if yes, execute step S223; if not, execute step S222.
Step S222: Send the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint to the display device.
Step S223: Send to the display device the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint, together with the images of the secondary viewpoint corresponding to the current viewpoint cropped from the secondary image frames in the secondary image frame sequence received over the secondary transmission path.
In this embodiment, when the user opens the multi-viewpoint video playback application on the display device and starts watching the live video of the basketball game, the current viewpoint (the default viewpoint) of the application may or may not be a virtual viewpoint; a viewpoint that is not virtual is a real viewpoint. A real viewpoint is a viewpoint that physically exists: the viewpoint corresponding to each camera is a real viewpoint. For example, the P1 to P9 viewpoints corresponding to the P1 to P9 cameras are all real viewpoints. A virtual viewpoint is a viewpoint that does not physically exist, namely a viewpoint between two adjacent real viewpoints; for example, a viewpoint between the P1 viewpoint and the P2 viewpoint is a virtual viewpoint. If the current viewpoint is a real viewpoint, the display device directly displays the images in the main image frame sequence of the current viewpoint. If the current viewpoint is a virtual viewpoint, the current viewpoint picture of the current viewpoint must be synthesized from the images of the secondary viewpoints corresponding to the current viewpoint in the secondary image frame, and the images in the main image frame sequence of the real viewpoint closest to the current viewpoint are then displayed. Here, the secondary viewpoints corresponding to the current viewpoint include the secondary viewpoints adjacent to the current viewpoint. The image of a secondary viewpoint corresponding to the current viewpoint includes a viewpoint picture and a viewpoint depth map picture, and the current viewpoint picture is synthesized from the viewpoint pictures and viewpoint depth map pictures in those images.
Specifically, after receiving the display instruction sent by the display device, the decoding device obtains the current viewpoint according to the display instruction and determines whether the current viewpoint is a virtual viewpoint. If the current viewpoint is a real viewpoint, the decoding device sends the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint to the display device as the first video data, and the display device displays the current viewpoint picture of the current viewpoint according to the first video data. If the current viewpoint is a virtual viewpoint, the decoding device sends, as the first video data, the images in the main image frame sequence received over the main transmission path corresponding to the current viewpoint together with the images of the secondary viewpoints corresponding to the current viewpoint cropped from the secondary image frames in the secondary image frame sequence received over the secondary transmission path. The display device synthesizes the current viewpoint picture from the viewpoint pictures and viewpoint depth map pictures in the images of those secondary viewpoints, and then displays the current viewpoint picture, that is, it displays the images in the main image frame sequence of the current viewpoint. The images displayed in this case are the images in the main image frame sequence of the real viewpoint closest to the current viewpoint. When the display device displays the current viewpoint picture according to the first video data, the picture seen by the user has a relatively high resolution.
For example, as shown in FIG. 6, 1 to 9 denote the P1 to P9 viewpoints respectively, and viewpoint A denotes the default viewpoint, that is, the current viewpoint. In FIG. 6, viewpoint A falls on the P5 viewpoint, so viewpoint A is the P5 viewpoint and is a real viewpoint. The decoding device therefore sends the images in the main image frame sequence received over the main transmission path corresponding to the P5 viewpoint to the display device as the first video data, and the display device displays the images in the main image frame sequence of the P5 viewpoint according to the first video data. As shown in FIG. 7, viewpoint A again denotes the default viewpoint, that is, the current viewpoint, but here it falls between the P5 viewpoint and the P6 viewpoint, so viewpoint A is a virtual viewpoint. The decoding device therefore sends, as the first video data, the images in the main image frame sequence received over the main transmission path corresponding to the P5 viewpoint together with the images of the P5 and P6 viewpoints cropped from the secondary image frames in the secondary image frame sequence received over the secondary transmission path. The display device synthesizes the current viewpoint picture of the current viewpoint from the viewpoint pictures and viewpoint depth map pictures in the images of the P5 and P6 viewpoints, and then displays the images in the main image frame sequence of the P5 viewpoint.
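The branch in steps S221 to S223 can be sketched as a small Python function. The sketch rests on assumptions that are not in the source: a viewpoint is modelled as a number on a line, where integer values 1 to 9 stand for the real viewpoints P1 to P9 and a fractional value stands for a virtual viewpoint between two neighbours; ties between equally distant neighbours are resolved toward the lower index. All function names are hypothetical.

```python
import math
from typing import Dict, List


def is_virtual(position: float) -> bool:
    """S221: a viewpoint that does not coincide with a camera is virtual."""
    return not float(position).is_integer()


def adjacent_real(position: float) -> List[int]:
    """The two real viewpoints on either side of a virtual viewpoint."""
    return [math.floor(position), math.ceil(position)]


def nearest_real(position: float) -> int:
    """Closest real viewpoint; the lower index wins a tie (an assumption)."""
    return min(adjacent_real(position), key=lambda v: abs(v - position))


def first_video_data(position: float) -> Dict[str, object]:
    """Describe which streams the decoding device forwards to the display."""
    if not is_virtual(position):
        # S222: main image frame sequence of the current viewpoint only.
        return {"main": int(position), "secondary": []}
    # S223: main sequence of the nearest real viewpoint plus the cropped
    # secondary images of the adjacent viewpoints (picture and depth map).
    return {"main": nearest_real(position), "secondary": adjacent_real(position)}
```

Under this model, FIG. 6 corresponds to `first_video_data(5.0)` and FIG. 7 to a call such as `first_video_data(5.3)`, which selects the P5 main sequence and the P5 and P6 crops.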
As shown in FIG. 8, in the fourth embodiment of the present application, based on the first embodiment, step S240 includes the following steps:
Step S241: Determine whether the target viewpoint is a virtual viewpoint; if yes, execute step S243; if not, execute step S242.
Step S242: Send to the display device the image of the secondary viewpoint that is the same as the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence.
Step S243: Send to the display device the images of the secondary viewpoints adjacent to the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence.
In this embodiment, after the user selects the target viewpoint to switch to on the video playback interface of the display device, the selected target viewpoint may or may not be a virtual viewpoint; a viewpoint that is not virtual is a real viewpoint. If the target viewpoint is a real viewpoint, the display device directly displays the image of the secondary viewpoint in the secondary image frame that is the same as the target viewpoint. If the target viewpoint is a virtual viewpoint, the viewpoint picture of the target viewpoint must be synthesized from the images of the secondary viewpoints adjacent to the target viewpoint in the secondary image frame, and the image of the secondary viewpoint closest to the target viewpoint is then displayed. The images of the secondary viewpoints adjacent to the target viewpoint include viewpoint pictures and viewpoint depth map pictures, and the viewpoint picture of the target viewpoint is synthesized from those viewpoint pictures and viewpoint depth map pictures.
Specifically, after receiving the viewpoint switching instruction sent by the display device, the decoding device obtains the target viewpoint according to the viewpoint switching instruction and determines whether the target viewpoint is a virtual viewpoint. If the target viewpoint is a real viewpoint, the decoding device sends the image of the secondary viewpoint that is the same as the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence, to the display device as the second video data, and the display device displays that image according to the second video data. If the target viewpoint is a virtual viewpoint, the decoding device sends the images of the secondary viewpoints adjacent to the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence, to the display device as the second video data. The display device synthesizes the first target viewpoint picture corresponding to the target viewpoint from the viewpoint pictures and viewpoint depth map pictures in those images, and then displays the image of the secondary viewpoint closest to the target viewpoint according to the second video data. Note that the resolution of the first target viewpoint picture displayed by the display device is lower than the resolution of the current viewpoint picture corresponding to the current viewpoint, so the viewpoint picture seen by the user at this stage has a relatively low resolution.
For example, as shown in FIG. 9, viewpoint B denotes the target viewpoint. In FIG. 9, viewpoint B falls on the P6 viewpoint, so viewpoint B is the P6 viewpoint and is a real viewpoint. The decoding device therefore sends the image of the P6 viewpoint cropped from a secondary image frame in the secondary image frame sequence to the display device as the second video data, and the display device displays that cropped image of the P6 viewpoint according to the second video data. As shown in FIG. 10, viewpoint C denotes the target viewpoint. In FIG. 10, viewpoint C falls between the P6 viewpoint and the P7 viewpoint, so viewpoint C is a virtual viewpoint. The decoding device therefore sends the images of the P6 and P7 viewpoints cropped from a secondary image frame in the secondary image frame sequence to the display device as the second video data. The display device synthesizes the first target viewpoint picture corresponding to viewpoint C from the viewpoint pictures and viewpoint depth map pictures in the images of the P6 and P7 viewpoints; since the P7 viewpoint is closest to viewpoint C, the display device displays the image of the P7 viewpoint according to the second video data.
Further, based on the fourth embodiment, when the target viewpoint is not a virtual viewpoint, after the step of sending to the display device the image of the secondary viewpoint that is the same as the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence, the step of sending, upon receiving the picture-switching-condition-satisfied instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint includes:
Upon receiving the picture-switching-condition-satisfied instruction sent by the display device, sending the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint to the display device.
Specifically, if the target viewpoint is a real viewpoint, the decoding device sends the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint to the display device as the third video data, and the display device displays the second target viewpoint picture corresponding to the target viewpoint according to the third video data. When the display device displays the second target viewpoint picture, the viewpoint picture seen by the user has a relatively high resolution. In other words, the display device returns from displaying the lower-resolution first target viewpoint picture to displaying the higher-resolution second target viewpoint picture, and the user again sees a higher-resolution viewpoint picture.
For example, as shown in FIG. 9, viewpoint B denotes the target viewpoint. In FIG. 9, viewpoint B falls on the P6 viewpoint, so viewpoint B is the P6 viewpoint and is a real viewpoint. The decoding device therefore sends the images in the main image frame sequence corresponding to the P6 viewpoint to the display device as the third video data, and the display device displays the second target viewpoint picture corresponding to the P6 viewpoint according to the third video data, that is, it displays the images in the main image frame sequence corresponding to the P6 viewpoint. The second target viewpoint picture displayed at this point has a relatively high resolution.
Further, based on the fourth embodiment, when the target viewpoint is a virtual viewpoint, after the step of sending to the display device the images of the secondary viewpoints adjacent to the target viewpoint, cropped from a secondary image frame in the secondary image frame sequence, the step of sending, upon receiving the picture-switching-condition-satisfied instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint further includes:
Upon receiving the picture-switching-condition-satisfied instruction sent by the display device, sending to the display device the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint, together with the images of the secondary viewpoints adjacent to the target viewpoint cropped from a secondary image frame in the secondary image frame sequence.
Specifically, if the target viewpoint is a virtual viewpoint, the decoding device sends, as the third video data, the images in the main image frame sequence received over the main transmission path corresponding to the target viewpoint together with the images of the secondary viewpoints adjacent to the target viewpoint cropped from a secondary image frame in the secondary image frame sequence, and the display device displays the second target viewpoint picture corresponding to the target viewpoint according to the third video data. When the display device displays the second target viewpoint picture, the viewpoint picture seen by the user has a relatively high resolution. In other words, the display device returns from displaying the lower-resolution first target viewpoint picture to displaying the higher-resolution second target viewpoint picture, and the user again sees a higher-resolution viewpoint picture.
For example, as shown in FIG. 10, viewpoint C denotes the target viewpoint. In FIG. 10, viewpoint C falls between the P6 viewpoint and the P7 viewpoint, so viewpoint C is a virtual viewpoint, and the P7 viewpoint is closest to viewpoint C. The decoding device therefore sends, as the third video data, the images in the main image frame sequence received over the main transmission path corresponding to the P7 viewpoint together with the images of the P6 and P7 viewpoints cropped from a secondary image frame in the secondary image frame sequence. The display device synthesizes the second target viewpoint picture corresponding to viewpoint C from the viewpoint pictures and viewpoint depth map pictures in the cropped images of the P6 and P7 viewpoints, and displays the second target viewpoint picture corresponding to viewpoint C according to the third video data, that is, it displays the images in the main image frame sequence of the P7 viewpoint. The second target viewpoint picture displayed at this point has a relatively high resolution.
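The two-stage switch described in the fourth embodiment (low-resolution second video data first, high-resolution third video data once the picture switching condition is satisfied) can be summarised in a short Python sketch. As before, this is a sketch under stated assumptions: viewpoints are modelled as numbers, with integers 1 to 9 for P1 to P9 and fractional values for virtual viewpoints, and the function and key names are hypothetical.

```python
import math
from typing import Dict, List


def adjacent_real(position: float) -> List[int]:
    """Real viewpoints involved: the viewpoint itself, or its two neighbours."""
    if float(position).is_integer():
        return [int(position)]
    return [math.floor(position), math.ceil(position)]


def nearest_real(position: float) -> int:
    """Closest real viewpoint; the lower index wins a tie (an assumption)."""
    return min(adjacent_real(position), key=lambda v: abs(v - position))


def second_video_data(target: float) -> Dict[str, object]:
    """S242/S243: only crops from the low-resolution secondary image frames."""
    return {"main": None, "secondary": adjacent_real(target), "resolution": "low"}


def third_video_data(target: float) -> Dict[str, object]:
    """Sent after the picture-switching-condition-satisfied instruction."""
    virtual = len(adjacent_real(target)) == 2
    return {
        "main": nearest_real(target),  # high-resolution main sequence
        "secondary": adjacent_real(target) if virtual else [],
        "resolution": "high",
    }


def switch_viewpoint(target: float) -> List[Dict[str, object]]:
    """Order in which the decoding device forwards data for one switch."""
    return [second_video_data(target), third_video_data(target)]
```

In this model, viewpoint B of FIG. 9 is `switch_viewpoint(6.0)` and viewpoint C of FIG. 10 is a call such as `switch_viewpoint(6.7)`, which yields the P6 and P7 crops first and then the P7 main sequence together with the same crops.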
As shown in FIG. 11, in the fifth embodiment of the present application, the multi-viewpoint video data processing method of the present application is applied to an encoding device and includes the following steps:
Step S310: Obtain the images of each viewpoint captured by each camera.
In this embodiment, before video shooting, several cameras must be arranged in advance at the shooting venue. Each camera is responsible for capturing images from one angle, and the image captured from one angle by a camera is the image of one viewpoint; that is, different cameras capture the images corresponding to different viewpoints. The captured video may be a live video, such as a live broadcast of a basketball or football game, or a recorded video, such as a recorded badminton match.
This embodiment takes a live video, for example a live video of a basketball game, as an example. When the live video of a basketball game is shot, several cameras are arranged around the venue. Each camera captures images of the game from one angle, and the image captured from one angle by a camera is the image of one viewpoint. As shown in FIG. 3, 1 to 9 denote the P1 to P9 viewpoints respectively, and one camera is provided for each viewpoint, namely the P1 to P9 cameras. The P1 to P9 cameras are the 9 cameras that shoot the basketball game, and each is responsible for capturing the images of one viewpoint: the image captured by the P1 camera is the image of the P1 viewpoint, the image captured by the P2 camera is the image of the P2 viewpoint, and so on up to the P9 camera and the P9 viewpoint. The image corresponding to each viewpoint includes at least one of a viewpoint picture and a viewpoint depth map picture. A viewpoint depth map picture, also called a range image, is an image whose pixel values are the distances (depths) from the image collector (such as a camera) to the points in the scene. Specifically, the encoding device obtains the images of each viewpoint captured by each camera, and the images so obtained include at least one of a viewpoint picture and a viewpoint depth map picture.
Step S320: Stitch the images of the viewpoints, and encode the stitched images according to the shooting time and the first resolution to generate the secondary image frame sequence.
Specifically, step S320 includes:
stitching the images of the viewpoints in a preset arrangement to generate a stitched image and the arrangement information of the image of each viewpoint in the stitched image;
sorting the stitched images according to the shooting time to generate a stitched image sequence;
encoding the stitched image sequence at the first resolution, and marking the encoded stitched image sequence with the arrangement information to obtain the secondary image frame sequence.
The arrangement information includes at least the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the stitched image.
In this embodiment, after obtaining the images captured by the P1 to P9 cameras at the P1 to P9 viewpoints, the encoding device encodes those images. The encoding device first stitches the images of the viewpoints together in a preset arrangement, generating a stitched image and the arrangement information of the image of each viewpoint in the stitched image. The stitched image is composed of the images captured at the P1 to P9 viewpoints at the same time; it can be understood as one large image divided into 9 small images.
The arrangement information includes at least the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the stitched image. The preset arrangements are shown in FIG. 12 and FIG. 13: if the image of each viewpoint in the stitched image includes a viewpoint picture, the images are arranged in the stitched image as shown in FIG. 12; if the image of each viewpoint in the stitched image includes a viewpoint picture and a viewpoint depth map picture, the images are arranged as shown in FIG. 13.
当各个视点的图像包括视点画面时,排布信息的格式是{x,y,w,h,view_id},x,y为视点画面左上角像素在拼接后的图像中的坐标,w,h为视点画面的宽和高,view_id为图像 或视点画面对应的视点标识,其中,x,y,w,h表示位置信息。当各个视点的图像包括视点画面和视点深度图画面时,排布信息的格式是{x,y,w,h,view_id,is_depth},x,y为视点画面或视点深度图画面左上角像素在拼接后的图像中的坐标,w,h为视点画面或视点深度图画面的宽和高,view_id为视点画面或视点深度图画面对应的视点标识,is_depth标记画面是否是视点深度图画面。其中,x,y,w,h,表示位置信息。其中,视点标识表示拼接后的图像中各个视点的图像具体是哪个视点的图像,例如,其中一张图像的视点标识是P9,则该图像就是是P9视点对应的图像,位置信息表示每个视点的图像具体排布在拼接后的图像中的哪个位置。When the image of each viewpoint includes a viewpoint picture, the format of the layout information is {x, y, w, h, view_id}, where x, y are the coordinates of the pixel in the upper left corner of the viewpoint picture in the spliced image, and w, h are The width and height of the viewpoint picture, view_id is the viewpoint identifier corresponding to the image or viewpoint picture, where x, y, w, and h represent the location information. When the images of each viewpoint include viewpoint images and viewpoint depth map images, the format of the layout information is {x, y, w, h, view_id, is_depth}, where x, y are the pixels in the upper left corner of the viewpoint images or viewpoint depth map images The coordinates in the spliced image, w, h are the width and height of the viewpoint picture or the viewpoint depth map picture, view_id is the viewpoint ID corresponding to the viewpoint picture or the viewpoint depth map picture, and is_depth marks whether the picture is a viewpoint depth map picture. Among them, x, y, w, h represent position information. Wherein, the viewpoint identifier indicates which viewpoint image the image of each viewpoint in the spliced image is, for example, if the viewpoint identifier of one of the images is P9, then the image is the image corresponding to the viewpoint of P9, and the position information represents each viewpoint Where in the spliced image are the specific images arranged.
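The record format above can be illustrated with a minimal Python sketch. The class name `LayoutEntry`, the helper `grid_layout`, the 3 x 3 row-major grid, and the 640 x 360 tile size are all assumptions made for illustration; the actual arrangements are those of FIG. 12 and FIG. 13, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LayoutEntry:
    """One arrangement-information record: {x, y, w, h, view_id, is_depth}."""
    x: int                  # column of the tile's top-left pixel in the stitched image
    y: int                  # row of the tile's top-left pixel in the stitched image
    w: int                  # tile width in pixels
    h: int                  # tile height in pixels
    view_id: str            # viewpoint identifier, e.g. "P9"
    is_depth: bool = False  # True if the tile is a viewpoint depth map picture


def grid_layout(view_ids: List[str], cols: int, w: int, h: int) -> List[LayoutEntry]:
    """Place tiles left to right, top to bottom (an assumed arrangement)."""
    return [
        LayoutEntry(x=(i % cols) * w, y=(i // cols) * h, w=w, h=h, view_id=v)
        for i, v in enumerate(view_ids)
    ]


# Nine viewpoints P1..P9 in an assumed 3 x 3 grid of 640 x 360 tiles.
layout = grid_layout([f"P{i}" for i in range(1, 10)], cols=3, w=640, h=360)
p9 = next(e for e in layout if e.view_id == "P9")
```

Under these assumptions the record for P9 places its tile in the bottom-right corner of a 1920 x 1080 stitched image.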
After generating the stitched images and the arrangement information of each viewpoint's image within them, the encoding device sorts the stitched images by shooting time to generate a stitched image sequence. Suppose n stitched images are generated in order of shooting time: image 1, image 2, image 3, ..., image n. Sorted in that order, images 1 to n form the stitched image sequence. The stitched image sequence is then encoded at the preset first resolution, and the encoded stitched image sequence is marked with the arrangement information to obtain the secondary image frame sequence.
Further, marking the encoded stitched image sequence with the arrangement information to obtain the secondary image frame sequence includes either inserting the arrangement information into the sequence header of the encoded stitched image sequence, or inserting the arrangement information into each stitched image of the encoded stitched image sequence. When the arrangement information is inserted into the sequence header, the decoding device only needs to read the arrangement information from the sequence header of the secondary image frame sequence to obtain the position information of every secondary viewpoint's image in every secondary image frame. When the arrangement information is inserted into each stitched image, the decoding device must read the arrangement information of each secondary image frame to obtain the position information of the secondary viewpoint images in that frame. Marking the encoded stitched image sequence with the arrangement information lets the decoding device crop the required image from a stitched image according to the arrangement information, which helps improve cropping efficiency.
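As a rough illustration of how a decoding device might use the arrangement information, the sketch below crops one viewpoint's tile out of a decoded stitched frame. The frame is modelled as a plain list of pixel rows, and the names `crop_view` and `header_layout` are hypothetical; a real decoder would operate on decoded picture buffers.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]            # rows of pixel values (stand-in for a decoded picture)
Rect = Tuple[int, int, int, int]   # (x, y, w, h) position information


def crop_view(frame: Frame, layout: Dict[str, Rect], view_id: str) -> Frame:
    """Cut the tile of `view_id` out of a stitched frame using its position information."""
    x, y, w, h = layout[view_id]
    return [row[x:x + w] for row in frame[y:y + h]]


# Toy 4 x 4 stitched frame holding four 2 x 2 tiles; pixel value = viewpoint number.
stitched = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]

# Arrangement information as it might be read once from the sequence header.
header_layout = {
    "P1": (0, 0, 2, 2),
    "P2": (2, 0, 2, 2),
    "P3": (0, 2, 2, 2),
    "P4": (2, 2, 2, 2),
}

tile = crop_view(stitched, header_layout, "P4")
```

With the sequence-header variant the `header_layout` table is read once and reused for every frame; with the per-image variant the same lookup would be repeated with the table carried by each frame.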
Step S330: Encode the images of each viewpoint according to the shooting time and a second resolution to generate a main image frame sequence for each viewpoint.
In this embodiment, while generating the secondary image frame sequence, the encoding device also separately encodes the images of each of viewpoints P1 to P9 according to the shooting time and the preset second resolution, generating a main image frame sequence for each viewpoint. That is, the images captured at viewpoint P1 are encoded separately according to the shooting time and the second resolution to generate the main image frame sequence of viewpoint P1, the images captured at viewpoint P2 are encoded separately to generate the main image frame sequence of viewpoint P2, and so on through viewpoint P9, yielding nine main image frame sequences.
Specifically, step S330 includes:
sorting the images of each viewpoint by shooting time to generate an image sequence for each viewpoint;
encoding the image sequence of each viewpoint at the second resolution to obtain the main image frame sequence.
Suppose that, in order of shooting time, the encoding device has acquired n images for each of viewpoints P1 to P9. Taking viewpoint P1 as an example, sorting its n images by shooting time yields the image sequence of viewpoint P1: image 1, image 2, image 3, ..., image n. Encoding this image sequence at the preset second resolution yields the main image frame sequence of viewpoint P1. The n images of each of viewpoints P2 to P9 are encoded in the same way as those of viewpoint P1, and the details are not repeated here.
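The sorting step can be sketched as follows, assuming each captured image carries a viewpoint identifier and a timestamp. The `Capture` record and `per_view_sequences` helper are illustrative names only, and the actual encoding at the second resolution is left out.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Capture(NamedTuple):
    view_id: str      # viewpoint the image was captured at, e.g. "P1"
    timestamp: float  # shooting time in seconds (assumed unit)
    name: str         # label used only for this illustration


def per_view_sequences(captures: List[Capture]) -> Dict[str, List[Capture]]:
    """Group captures by viewpoint and sort each group by shooting time."""
    by_view: Dict[str, List[Capture]] = defaultdict(list)
    for c in captures:
        by_view[c.view_id].append(c)
    return {v: sorted(cs, key=lambda c: c.timestamp) for v, cs in by_view.items()}


# Captures arriving out of order from two viewpoints.
captures = [
    Capture("P1", 0.08, "image 3"),
    Capture("P2", 0.00, "image 1"),
    Capture("P1", 0.00, "image 1"),
    Capture("P1", 0.04, "image 2"),
]
sequences = per_view_sequences(captures)
```

Each list in `sequences` would then be handed to a video encoder configured for the second resolution to produce that viewpoint's main image frame sequence.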
Note that in this embodiment the first resolution is the total resolution of the stitched image, and the resolution of each viewpoint's image within the stitched image is lower than the second resolution; that is, each viewpoint's image in the stitched image has a lower resolution than the images in that viewpoint's main image frame sequence.
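A small worked example may make the relation concrete. The numbers are assumptions chosen for illustration (the source gives no specific resolutions): a stitched image of 1920 x 1080 split into a 3 x 3 grid, and main sequences also encoded at 1920 x 1080.

```python
from typing import Tuple


def tile_resolution(first_resolution: Tuple[int, int], cols: int, rows: int) -> Tuple[int, int]:
    """Per-viewpoint resolution inside a stitched image split into an even grid."""
    width, height = first_resolution
    return width // cols, height // rows


first_resolution = (1920, 1080)   # assumed total size of the stitched image
second_resolution = (1920, 1080)  # assumed size of each main-sequence image

tile = tile_resolution(first_resolution, cols=3, rows=3)
is_lower = tile[0] < second_resolution[0] and tile[1] < second_resolution[1]
```

Under these assumptions each viewpoint occupies 640 x 360 pixels in the stitched image, one ninth of the pixel count of its main-sequence image.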
Further, the first image of the image sequence of each viewpoint is encoded as an I frame. An I frame, also called an intra-coded picture, is usually the first frame of each GOP (group of pictures, a frame grouping used in MPEG video compression); it is moderately compressed, serves as a reference point for random access, and can be treated as a standalone image. For multi-viewpoint video, viewpoint switching can only take place at an I frame, so when encoding the images of each viewpoint this embodiment encodes the first image of each viewpoint's image sequence as an I frame. Because the first image of every viewpoint's main image frame sequence is an I frame, the stream can follow arbitrary viewpoint switches and switch to displaying the high-resolution main image frame sequence of the target viewpoint. After the user switches viewpoints on the display device, the display device can therefore quickly resume showing a high-resolution viewpoint picture from the secondary image frame sequence and the per-viewpoint main image frame sequences provided by the encoding device, preserving picture clarity over long viewing sessions.
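The constraint that a main sequence can only be joined at an I frame can be sketched as a search for the next usable switch point. The frame-type list and the GOP length of 4 are assumptions for illustration, and `next_switch_point` is a hypothetical helper rather than part of any codec API.

```python
from typing import List, Optional


def next_switch_point(frame_types: List[str], requested: int) -> Optional[int]:
    """Return the index of the first I frame at or after `requested`, or None."""
    for i in range(requested, len(frame_types)):
        if frame_types[i] == "I":
            return i
    return None


# Two GOPs of an assumed length of 4: one I frame followed by three P frames.
frame_types = ["I", "P", "P", "P", "I", "P", "P", "P"]

# A switch requested at frame 1 cannot take effect on the main sequence until frame 4.
switch_at = next_switch_point(frame_types, requested=1)
```

In the gap between the request and the switch point, a player could keep showing the lower-resolution tile cropped from the secondary image frame sequence, which matches the role the secondary sequence plays in this embodiment.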
Step S340: Upon receiving a viewpoint selection instruction sent by the decoding device, obtain the viewpoint selected by the decoding device according to the viewpoint selection instruction.
Step S350: Transmit the main image frame sequence of the viewpoint selected by the decoding device to the decoding device over that viewpoint's main transmission path, and at the same time transmit the secondary image frame sequence to the decoding device over the secondary transmission path.
In this embodiment, after generating the secondary image frame sequence and the main image frame sequence of each viewpoint, the encoding device obtains the viewpoint selected by the decoding device from the viewpoint selection instruction received from the decoding device. It then transmits the main image frame sequence of the selected viewpoint to the decoding device over that viewpoint's main transmission path, and at the same time transmits the secondary image frame sequence to the decoding device over the secondary transmission path.
The encoding device transmits the generated main image frame sequence of each viewpoint to the decoding device over its own independent transmission path, and likewise transmits the generated secondary image frame sequence to the decoding device over an independent transmission path. For ease of understanding, an independent transmission path that carries the main image frame sequence of a viewpoint is called a main transmission path, and the independent transmission path that carries the secondary image frame sequence is called the secondary transmission path. With 9 viewpoints there are 9 main transmission paths and 1 secondary transmission path. Each main transmission path carries the main image frame sequence of its viewpoint (for example, the main transmission path of viewpoint P1 carries the main image frame sequence of viewpoint P1), and the secondary transmission path carries the secondary image frame sequence.
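The one-path-per-sequence arrangement can be written down as a simple lookup table. The numeric path identifiers and the name `allocate_paths` are illustrative assumptions; the source does not say how a transmission path is realised (for example as a socket, a stream, or a channel).

```python
from typing import Dict, List

SECONDARY = "secondary"  # key used here for the secondary image frame sequence


def allocate_paths(view_ids: List[str]) -> Dict[str, int]:
    """Path 0 carries the secondary sequence; path k carries the k-th viewpoint's main sequence."""
    paths = {SECONDARY: 0}
    paths.update({v: k for k, v in enumerate(view_ids, start=1)})
    return paths


paths = allocate_paths([f"P{i}" for i in range(1, 10)])
main_path_count = len(paths) - 1
```

For nine viewpoints this gives ten entries in total: nine main transmission paths and one secondary transmission path.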
The viewpoint selection instruction is generated by the decoding device according to either a display instruction or a picture-switching-condition-satisfied instruction sent by the display device. For example, suppose the display device currently needs to display the viewpoint picture of a certain viewpoint, say viewpoint P2. The display device generates a display instruction from viewpoint P2 and sends the display instruction containing viewpoint P2 to the decoding device. After obtaining viewpoint P2 from the display instruction, the decoding device generates a viewpoint selection instruction containing viewpoint P2 and sends it to the encoding device. The encoding device obtains viewpoint P2 from the received viewpoint selection instruction; P2 is the viewpoint selected by the decoding device. The encoding device then transmits the main image frame sequence of viewpoint P2 to the decoding device over the main transmission path of viewpoint P2, and at the same time transmits the secondary image frame sequence to the decoding device over the secondary transmission path.
As another example, suppose the user switches from the current viewpoint P2 to the target viewpoint P4 on the display device, so that the viewpoint picture of viewpoint P4 is the picture about to be displayed. When the display device determines that the picture switching condition for viewpoint P4 is satisfied, it generates a picture-switching-condition-satisfied instruction containing viewpoint P4 and sends it to the decoding device. After obtaining viewpoint P4 from that instruction, the decoding device generates a viewpoint selection instruction containing viewpoint P4 and sends it to the encoding device. The encoding device obtains viewpoint P4 from the received viewpoint selection instruction; P4 is the viewpoint selected by the decoding device. The encoding device then transmits the main image frame sequence of viewpoint P4 to the decoding device over the main transmission path of viewpoint P4.
With the above technical solution, this embodiment provides the display device with the secondary image frame sequence and the per-viewpoint main image frame sequences needed to display viewpoint pictures. After the user switches viewpoints on the display device, the display device can quickly resume displaying a high-resolution viewpoint picture from the secondary image frame sequence and the per-viewpoint main image frame sequences provided by the encoding device, preserving picture clarity over long viewing sessions.
Based on steps S310 to S350 above, the encoding device of this embodiment is illustrated by the following example:
For example, nine cameras capture the images of viewpoints P1 to P9. After encoding, the main image frame sequences of viewpoints P1 to P9 are main image frame sequence F1, main image frame sequence F2, ..., main image frame sequence F9 respectively, and the secondary image frame sequence is secondary image frame sequence F0. Each stitched image in F0 contains the images of viewpoints P1 to P9, namely images f1 to f9. The main transmission path of viewpoint P1 is path 1, the main transmission path of viewpoint P2 is path 2, ..., the main transmission path of viewpoint P9 is path 9, and the secondary transmission path is path 0. If the viewpoint selected by the decoding device is viewpoint P1, the encoding device transmits main image frame sequence F1 to the decoding device over path 1; if the selected viewpoint is viewpoint P5, the encoding device transmits main image frame sequence F5 to the decoding device over path 5; if the selected viewpoint is a virtual viewpoint that lies between viewpoints P3 and P4 and is closest to viewpoint P4, the encoding device transmits main image frame sequence F4 to the decoding device over path 4, and at the same time transmits secondary image frame sequence F0 to the decoding device over path 0.
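The selection logic in this example can be sketched as below. The sketch assumes, purely for illustration, that the viewpoints lie on a linear axis with viewpoint Pk at position k, so that a non-integer position stands for a virtual viewpoint; the source does not specify how the distance to a virtual viewpoint is measured. It also follows step S350 in sending F0 alongside the main sequence in every case. The function name `sequences_to_send` is hypothetical.

```python
from typing import List, Tuple


def sequences_to_send(selected: float, view_count: int = 9) -> List[Tuple[str, int]]:
    """Return (sequence, path) pairs for a selected viewpoint position.

    `selected` is a position on an assumed linear camera axis where viewpoint
    Pk sits at k; a non-integer value denotes a virtual viewpoint.
    """
    # Pick the real viewpoint closest to the selected position.
    nearest = min(range(1, view_count + 1), key=lambda k: abs(k - selected))
    # Main sequence Fk travels on path k; secondary sequence F0 travels on path 0.
    return [(f"F{nearest}", nearest), ("F0", 0)]


real_view = sequences_to_send(5)       # decoding device selected viewpoint P5
virtual_view = sequences_to_send(3.7)  # virtual viewpoint between P3 and P4, closer to P4
```

Here `real_view` pairs F5 with path 5 and `virtual_view` pairs F4 with path 4, each together with F0 on path 0.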
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (15)

  1. A multi-viewpoint video data processing method, applied to a decoding device, the multi-viewpoint video data processing method comprising:
    upon receiving a display instruction sent by a display device, obtaining a current viewpoint corresponding to the display instruction;
    sending, to the display device, first video data required by the display device to display a current viewpoint picture corresponding to the current viewpoint; the first video data comprising an image in a main image frame sequence received over a main transmission path corresponding to the current viewpoint and/or an image of a secondary viewpoint corresponding to the current viewpoint cropped from a secondary image frame of a secondary image frame sequence received over a secondary transmission path, wherein the images in the main image frame sequence and the images in the secondary image frame each comprise a viewpoint picture and/or a viewpoint depth map picture, and the resolution of the images in the main image frame sequence is greater than the resolution of the images in the secondary image frame;
    upon receiving a viewpoint switching instruction sent by the display device, obtaining a target viewpoint corresponding to the viewpoint switching instruction;
    sending, to the display device, second video data required by the display device to display a first target viewpoint picture corresponding to the target viewpoint; the second video data comprising an image of a secondary viewpoint corresponding to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence;
    upon receiving a picture-switching-condition-satisfied instruction sent by the display device, sending, to the display device, third video data required by the display device to display a second target viewpoint picture corresponding to the target viewpoint; the third video data comprising an image in a main image frame sequence received over a main transmission path corresponding to the target viewpoint and/or an image of a secondary viewpoint corresponding to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence.
  2. The method according to claim 1, wherein before the step of sending, to the display device, the first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint, the method further comprises:
    obtaining a viewpoint identifier of the secondary viewpoint corresponding to the current viewpoint and arrangement information of the secondary image frame sequence;
    determining, according to the arrangement information and the viewpoint identifier, position information in the secondary image frame of the image of the secondary viewpoint corresponding to the current viewpoint;
    cropping the image corresponding to the position information from a secondary image frame of the secondary image frame sequence.
  3. The method according to claim 1, wherein the step of sending, to the display device, the first video data required by the display device to display the current viewpoint picture corresponding to the current viewpoint comprises:
    when the current viewpoint is not a virtual viewpoint, sending, to the display device, an image in the main image frame sequence received over the main transmission path corresponding to the current viewpoint.
  4. The method according to claim 3, wherein after the step of determining whether the current viewpoint is a virtual viewpoint, the method further comprises:
    when the current viewpoint is a virtual viewpoint, sending, to the display device, an image in the main image frame sequence received over the main transmission path corresponding to the current viewpoint and an image of the secondary viewpoint corresponding to the current viewpoint cropped from a secondary image frame of the secondary image frame sequence received over the secondary transmission path; the secondary viewpoint corresponding to the current viewpoint comprising a secondary viewpoint adjacent to the current viewpoint.
  5. The method according to claim 1, wherein the step of sending, to the display device, the second video data required by the display device to display the first target viewpoint picture corresponding to the target viewpoint comprises:
    when the target viewpoint is not a virtual viewpoint, sending, to the display device, an image of the secondary viewpoint identical to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence.
  6. The method according to claim 5, wherein, after the step of sending, to the display device, when the target viewpoint is not a virtual viewpoint, the image of the secondary viewpoint identical to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence, the step of sending, to the display device, upon receiving the picture-switching-condition-satisfied instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint comprises:
    upon receiving the picture-switching-condition-satisfied instruction sent by the display device, sending, to the display device, an image in the main image frame sequence received over the main transmission path corresponding to the target viewpoint.
  7. The method according to claim 5, wherein after the step of determining whether the target viewpoint is a virtual viewpoint, the method further comprises:
    when the target viewpoint is a virtual viewpoint, sending, to the display device, an image of a secondary viewpoint adjacent to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence.
  8. The method according to claim 7, wherein, after the step of sending, to the display device, when the target viewpoint is a virtual viewpoint, the image of the secondary viewpoint adjacent to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence, the step of sending, to the display device, upon receiving the picture-switching-condition-satisfied instruction sent by the display device, the third video data required by the display device to display the second target viewpoint picture corresponding to the target viewpoint further comprises:
    upon receiving the picture-switching-condition-satisfied instruction sent by the display device, sending, to the display device, an image in the main image frame sequence received over the main transmission path corresponding to the target viewpoint and an image of the secondary viewpoint adjacent to the target viewpoint cropped from a secondary image frame of the secondary image frame sequence.
  9. 一种多视点视频数据处理方法,其中,应用于编码设备,所述多视点视频数据处理方法包括:A method for processing multi-viewpoint video data, wherein, applied to a coding device, the method for processing multi-viewpoint video data includes:
    获取各个摄像机拍摄的各个视点的图像,不同摄像机拍摄不同视点对应的图像,所述图像包括视点画面和视点深度图画面中的至少一个;Acquiring images of various viewpoints taken by each camera, where different cameras take images corresponding to different viewpoints, and the images include at least one of a viewpoint picture and a viewpoint depth map picture;
    对各个所述视点的图像进行拼接,并按照拍摄时间和第一分辨率对拼接后的图像进行编码,生成从图像帧序列;Stitching the images of each viewpoint, and encoding the spliced images according to the shooting time and the first resolution, to generate a sequence of secondary image frames;
    按照所述拍摄时间和第二分辨率对每个所述视点的图像进行编码,生成每个视点的主图像帧序列,所述主图像帧序列中的图像的分辨率大于拼接后的图像中图像的分辨率;Encode the images of each viewpoint according to the shooting time and the second resolution to generate a main image frame sequence of each viewpoint, the resolution of the images in the main image frame sequence is greater than that of the images in the spliced images resolution;
    接收到解码设备发送的视点选定指令时,根据视点选定指令获取所述解码设备选定的视点;When receiving the viewpoint selection instruction sent by the decoding device, acquire the viewpoint selected by the decoding device according to the viewpoint selection instruction;
    将所述解码设备选定的视点的主图像帧序列由所述视点的主传输路径传输至所述解码设备,同时将所述从图像帧序列由从传输路径传输至所述解码设备。The main image frame sequence of the viewpoint selected by the decoding device is transmitted to the decoding device through the main transmission path of the viewpoint, and at the same time, the secondary image frame sequence is transmitted to the decoding device through the secondary transmission path.
  10. The method according to claim 9, wherein the step of stitching the images of the respective viewpoints and encoding the stitched images according to shooting time and a first resolution to generate a secondary image frame sequence comprises:
    stitching the images of the respective viewpoints in a preset arrangement to generate a stitched image and arrangement information describing how the image of each viewpoint is placed in the stitched image, the arrangement information comprising at least a viewpoint identifier of each viewpoint and position information of the image of each viewpoint within the stitched image;
    sorting the stitched images by the shooting time to generate a stitched image sequence;
    encoding the stitched image sequence at the first resolution, and marking the encoded stitched image sequence with the arrangement information to obtain the secondary image frame sequence.
  11. The method according to claim 10, wherein the step of marking the encoded stitched image sequence with the arrangement information to obtain the secondary image frame sequence comprises:
    inserting the arrangement information into the sequence header of the encoded stitched image sequence to obtain the secondary image frame sequence.
  12. The method according to claim 10, wherein the step of marking the encoded stitched image sequence with the arrangement information to obtain the secondary image frame sequence further comprises:
    inserting the arrangement information into each stitched image of the encoded stitched image sequence to obtain the secondary image frame sequence.
  13. A decoding device, comprising: a memory, a processor, and a multi-viewpoint video data processing program stored in the memory and executable on the processor, wherein the multi-viewpoint video data processing program, when executed by the processor, implements the steps of the multi-viewpoint video data processing method according to any one of claims 1-8.
  14. An encoding device, comprising: a memory, a processor, and a multi-viewpoint video data processing program stored in the memory and executable on the processor, wherein the multi-viewpoint video data processing program, when executed by the processor, implements the steps of the multi-viewpoint video data processing method according to any one of claims 9-12.
  15. A storage medium storing a multi-viewpoint video data processing program, wherein the multi-viewpoint video data processing program, when executed by a processor, implements the steps of the multi-viewpoint video data processing method according to any one of claims 1-12.
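
The stitching and marking steps recited in claims 10 to 12 can be illustrated with a minimal sketch. It is not the applicant's implementation: the names `Tile`, `stitch` and `build_secondary_sequence`, the row-major grid used as the "preset arrangement", and the dictionary used as a container are all assumptions made for illustration. Images are modelled as nested lists of pixel values, all viewpoints are assumed to share one size, and the actual encoding at the first resolution is left out, since a real system would hand each stitched image to a video codec at that point.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for real frames


@dataclass(frozen=True)
class Tile:
    """Arrangement information for one viewpoint (hypothetical structure)."""
    viewpoint_id: int
    x: int
    y: int
    width: int
    height: int


def stitch(views: Dict[int, Image], columns: int) -> Tuple[Image, List[Tile]]:
    """Place equally sized viewpoint images on a row-major grid.

    Returns the stitched image and, per viewpoint, its identifier and
    position inside the stitched image (the arrangement information).
    """
    ids = sorted(views)
    tile_h = len(views[ids[0]])
    tile_w = len(views[ids[0]][0])
    rows = -(-len(ids) // columns)  # ceiling division
    canvas = [[0] * (tile_w * columns) for _ in range(tile_h * rows)]
    layout = []
    for index, vid in enumerate(ids):
        x = (index % columns) * tile_w
        y = (index // columns) * tile_h
        for r, row in enumerate(views[vid]):
            canvas[y + r][x:x + tile_w] = row
        layout.append(Tile(vid, x, y, tile_w, tile_h))
    return canvas, layout


def build_secondary_sequence(captures, columns, per_frame=False):
    """Build the secondary sequence from (shooting_time, views) pairs.

    per_frame=False stores the arrangement information once, in the
    sequence header (claim 11); per_frame=True attaches it to every
    stitched image instead (claim 12). Encoding is omitted here.
    """
    frames = []
    layout = []
    for shooting_time, views in sorted(captures, key=lambda c: c[0]):
        canvas, layout = stitch(views, columns)
        frame = {"time": shooting_time, "data": canvas}
        if per_frame:
            frame["layout"] = layout
        frames.append(frame)
    header = {} if per_frame else {"layout": layout}
    return {"header": header, "frames": frames}
```

Under these assumptions a decoder can recover any single viewpoint by reading its `Tile` and cropping the rectangle `(x, y, width, height)` out of the stitched image. Header-level marking fits an arrangement that stays fixed for the whole sequence, while per-frame marking allows the arrangement to change from one stitched image to the next at the cost of repeating the metadata.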
PCT/CN2021/134319 2021-09-02 2021-11-30 Multi-viewpoint video data processing method, device, and storage medium WO2023029252A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111035779.7A CN113949884A (en) 2021-09-02 2021-09-02 Multi-view video data processing method, device and storage medium
CN202111035779.7 2021-09-02

Publications (1)

Publication Number Publication Date
WO2023029252A1 (en) 2023-03-09

Family

ID=79328014

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/134319 WO2023029252A1 (en) 2021-09-02 2021-11-30 Multi-viewpoint video data processing method, device, and storage medium

Country Status (2)

Country Link
CN (1) CN113949884A (en)
WO (1) WO2023029252A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116612168A (en) * 2023-04-20 2023-08-18 北京百度网讯科技有限公司 Image processing method, device, electronic equipment, image processing system and medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN108810636A (en) * 2017-04-28 2018-11-13 华为技术有限公司 Video broadcasting method, equipment and system
US20200195997A1 (en) * 2017-09-12 2020-06-18 Panasonic Intellectual Property Corporation Of America Image display method, image distribution method, image display apparatus, and image distribution apparatus
CN111447461A (en) * 2020-05-20 2020-07-24 上海科技大学 Synchronous switching method, device, equipment and medium for multi-view live video
CN111866525A (en) * 2020-09-23 2020-10-30 腾讯科技(深圳)有限公司 Multi-view video playing control method and device, electronic equipment and storage medium
CN113259770A (en) * 2021-05-11 2021-08-13 北京奇艺世纪科技有限公司 Video playing method, device, electronic equipment, medium and product

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
KR101399951B1 (en) * 2013-07-25 2014-06-17 주식회사 넥스트이온 Multi-view video steaming system and providing method thereof
CN106919248A (en) * 2015-12-26 2017-07-04 华为技术有限公司 It is applied to the content transmission method and equipment of virtual reality
US11006135B2 (en) * 2016-08-05 2021-05-11 Sony Corporation Image processing apparatus and image processing method
CN109672897B (en) * 2018-12-26 2021-03-16 北京数码视讯软件技术发展有限公司 Panoramic video coding method and device
CN113099245B (en) * 2021-03-04 2023-07-25 广州方硅信息技术有限公司 Panoramic video live broadcast method, system and computer readable storage medium

Also Published As

Publication number Publication date
CN113949884A (en) 2022-01-18

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21955780; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)