CN117241105A - Media information processing method and device and storage medium - Google Patents

Media information processing method and device and storage medium

Info

Publication number
CN117241105A
CN117241105A (application CN202210642307.6A)
Authority
CN
China
Prior art keywords
media
information
slicing
media information
time stamp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210642307.6A
Other languages
Chinese (zh)
Inventor
陈奇
王魏强
张晓渠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202210642307.6A priority Critical patent/CN117241105A/en
Priority to PCT/CN2023/089286 priority patent/WO2023236666A1/en
Publication of CN117241105A publication Critical patent/CN117241105A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845: Structuring of content, e.g. decomposing content into time segments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/8547: Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a media information processing method, a device, and a storage medium. The media information processing method comprises the following steps: receiving a plurality of media information streams; acquiring the first display timestamp of a received target media information packet; taking that first display timestamp as the starting display timestamp of every media information stream; slicing each media information stream according to the starting display timestamp to obtain a plurality of pieces of media slice information, where each piece of media slice information carries a slice sequence number and all pieces with the same slice sequence number have the same media duration; and aggregating all target media slice information, i.e. the pieces of media slice information having the same slice sequence number, to obtain free-viewpoint media slice information. With the embodiments of the application, a user can switch seamlessly between free viewpoints and the user's video experience is improved, filling a gap in related methods.

Description

Media information processing method and device and storage medium
Technical Field
The application relates to the field of video technology, and in particular to a media information processing method and device and a computer storage medium.
Background
With the rapid development of 5G and high-speed Internet access, the metaverse and fully immersive Internet are arriving quickly, and immersive media applications are developing rapidly. Current free-viewpoint technology lets viewers freely choose any 360-degree viewing angle at any moment, improving the immersive experience: a user can switch viewing angles freely while watching a video. However, because the video streams of the different viewing angles of the same moment, shot by multiple cameras, may reach the media server with a large time difference, good image quality of the whole picture cannot be guaranteed, which greatly degrades the user experience.
Disclosure of Invention
The embodiments of the application provide a media information processing method and device and a computer storage medium, which can improve a user's video viewing experience.
In a first aspect, an embodiment of the present application provides a media information processing method, including:
receiving a plurality of media information streams, wherein the media information streams comprise a plurality of media information packets;
acquiring a first display time stamp of a received target media information packet, wherein the target media information packet is a first received media information packet in all the media information packets;
taking the first display time stamp as the starting display time stamp of each media information stream;
performing information slicing on each media information stream according to the starting display time stamp to obtain a plurality of pieces of media slice information of each media information stream, wherein each piece of media slice information corresponds to a slice sequence number, and all pieces of media slice information with the same slice sequence number have the same media duration;
and aggregating the target media slice information in all the media information streams to obtain free-viewpoint media slice information, wherein the target media slice information is the media slice information with the same slice sequence number.
In a second aspect, an embodiment of the present application further provides a media information processing device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the media information processing method as described above when executing the computer program.
In a third aspect, embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions for performing a media information processing method as described above.
In the embodiments of the application, the first display timestamp of the target media information packet is uniformly set as the starting display timestamp of every media information stream, which remedies the inconsistency with which the pictures of the individual media information streams for the same moment arrive at the media server. Each media information stream is then sliced according to the starting display timestamp to obtain a plurality of pieces of media slice information, and the pieces of media slice information with the same slice sequence number across all media information streams are aggregated into complete free-viewpoint media slice information. This preserves picture quality while preventing large spatial jumps of the video picture when the user switches viewing angles. The embodiments therefore allow the user to switch seamlessly between free viewpoints, improve the user's video experience, and fill a gap in related methods.
Drawings
FIG. 1 is a flow chart of a media information processing method provided by an embodiment of the present application;
FIG. 2a is a schematic diagram of a plurality of media information streams provided in accordance with one embodiment of the present application prior to alignment;
FIG. 2b is a schematic diagram of a plurality of media information streams aligned according to one embodiment of the present application;
FIG. 3 is a flowchart of obtaining a plurality of pieces of media slice information of each media information stream in a media information processing method according to another embodiment of the present application;
FIG. 4 is a flowchart of the steps preceding the obtaining of a plurality of pieces of media slice information of each media information stream in a media information processing method according to an embodiment of the present application;
FIG. 5 is a flowchart of obtaining free viewpoint media slice information in a media information processing method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a media server for performing a media information processing method according to one embodiment of the present application;
FIG. 7 is a flowchart of a media information processing method performed by an alignment module according to one embodiment of the present application;
FIG. 8 is a schematic diagram of a plurality of media information streams provided by another embodiment of the present application;
FIG. 9 is a flowchart of a method for performing media information processing by a stitching module according to one embodiment of the present application;
fig. 10 is a schematic diagram of a media information processing device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order. The terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar elements and do not necessarily describe a particular sequence or chronological order.
At present, to mitigate the large time difference with which the video streams of all viewing angles of the same moment, shot by multiple cameras, reach the media server, the prior art compresses and splices the video streams of all cameras into one picture of ultra-high resolution and then performs image correction. This places a relatively high network-bandwidth requirement on the user, and the camera resolution must be reduced to match the resolution of the user's player, which greatly degrades the user experience.
Based on this, the application provides a media information processing method and device, a computer storage medium, and a computer program product. One embodiment of the media information processing method includes: receiving a plurality of media information streams, each comprising a plurality of media information packets; acquiring the first display timestamp of a received target media information packet, the target media information packet being the first received packet among all media information packets; taking the first display timestamp as the starting display timestamp of every media information stream; slicing each media information stream according to the starting display timestamp to obtain a plurality of pieces of media slice information per stream, where each piece of media slice information carries a slice sequence number and all pieces with the same slice sequence number have the same media duration; and aggregating the target media slice information across all media information streams, i.e. the pieces of media slice information with the same slice sequence number, to obtain free-viewpoint media slice information.
In this embodiment, the first display timestamp of the target media information packet is uniformly set as the starting display timestamp of every media information stream, remedying the inconsistency with which the pictures of the individual streams for the same moment arrive at the media server. Each media information stream is then sliced according to the starting display timestamp to obtain a plurality of pieces of media slice information, and the pieces with the same slice sequence number across all streams are aggregated into complete free-viewpoint media slice information, which preserves picture quality while preventing large spatial jumps of the video picture when the user switches viewing angles. The embodiment therefore allows the user to switch seamlessly between free viewpoints, improves the user's video experience, and fills a gap in related methods.
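The flow described above can be sketched in code. The following Python sketch is illustrative only: the names (`MediaPacket`, `align_start_pts`, `slice_sequence_number`), the 90 kHz timebase, and the 6 s slice duration are assumptions drawn from the numeric example later in the description, not definitions from the application itself.

```python
# Illustrative sketch of the method's data flow: align all streams to one
# start PTS, then map each packet's PTS to a slice sequence number so that
# slices with the same number cover the same time period across streams.
from dataclasses import dataclass

TIMEBASE = 90000       # assumed 90 kHz clock, typical for video PTS values
SLICE_SECONDS = 6      # slice duration used in the description's example


@dataclass
class MediaPacket:
    stream_id: int     # camera position the packet came from
    pts: int           # display timestamp in timebase ticks


def align_start_pts(packets):
    """Return the PTS of the first received packet; it becomes the starting
    display timestamp of every media information stream (steps S120/S130)."""
    return packets[0].pts


def slice_sequence_number(pts, start_pts):
    """Map a packet PTS to a slice sequence number relative to the shared
    starting display timestamp (step S140)."""
    return (pts - start_pts) // (SLICE_SECONDS * TIMEBASE)
```

With the packets of the later example (first received packet from camera 2 at PTS 7200), every stream adopts 7200 as its starting display timestamp, and a packet at PTS 547200 falls into slice number 1.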
Embodiments of the present application will be further described below with reference to the accompanying drawings.
As shown in fig. 1, fig. 1 is a flowchart of a media information processing method according to an embodiment of the present application, and the media information processing method may include, but is not limited to, steps S110 to S150.
Step S110: a plurality of media information streams is received, wherein the media information streams include a plurality of media information packets.
In this step, a plurality of media information streams are received so that, in subsequent steps, it can be determined which media information stream contains the first received media information packet and the received streams can be accurately distinguished from one another.
In an embodiment, the execution subject of steps S110 to S150 and the related steps may be chosen by those skilled in the art according to the specific situation, and is not limited here. For example, a media server that manages all media information streams may act as the execution subject: the media server receives the plurality of media information streams and performs the following steps S120 to S150 and related steps on them. Corresponding functional modules can be provided in the media server to execute the corresponding steps for a better overall effect; for instance, a stream-receiving module can be provided in the media server to pull the media information streams from each camera position at the free-viewpoint front end and add them to a stream-receiving buffer queue inside the module. As another example, another server, node, module, or device that manages the media server may serve as the execution subject, indirectly processing the multiple media information streams through the media server. It should be noted that the following embodiments mainly take the "media server" as the execution subject of steps S110 to S150 and the related steps, but the application is not limited thereto.
In one embodiment, the media server is an important device of the next-generation network. Under the control of a control device (e.g., a softswitch or an application server), it provides the media resource functions required to implement various services on an IP network, including service tones, conferencing, interactive response, announcements, unified messaging, and advanced voice services. The application server can send commands such as playback to the media server using, but not limited to, MSML (Media Server Markup Language). The media server is highly tailorable and can flexibly implement one or more functions, including but not limited to:
dual-tone multi-frequency (DTMF) signal acquisition and decoding: receiving DTMF signals from a DTMF telephone according to the operation parameters specified by the control device, encapsulating them in signaling, and transmitting the signaling to the control device;
recorded-announcement sending: playing a specified recorded announcement to a user in a specified voice, as required by the control device;
conference function: supporting audio mixing of multiple RTP streams, including streams in different coding formats;
conversion between different codec algorithms: supporting multiple voice codecs such as G.711, G.723, and G.729, and converting between them;
automatic speech synthesis: concatenating multiple voice elements or fields into a complete voice prompt, which may be fixed or variable;
dynamic voice play/record: including music on hold, follow-me voice services, etc.;
generation and transmission of tone signals: providing basic signal tones such as dial tone, busy tone, ringback tone, waiting tone, and silence;
maintenance and management of resources: local and/or remote maintenance and management of the media resources and the device itself, such as data configuration and fault management.
The media server has at least one of the following characteristics:
advancedness: ITU-T H.248 and the SIP standard protocol can be adopted;
compatibility: interworking with softswitch systems from different vendors can be achieved conveniently;
high reliability: dual power supplies with hot-plug support are provided; the device is positioned as carrier-class equipment and protects against system congestion;
easy maintainability: SNMP network management is supported, and the system, resources, and post-hoc analysis can be maintained online;
high scalability and easy upgradeability: an independent application layer can customize various value-added services for users, and the system can be upgraded online, meeting user needs to the greatest extent;
flexibility: flexible networking modes and strong comprehensive access capability provide users with a variety of solutions.
In an embodiment, the way each camera position's media information stream is received is not limited: different camera positions' streams may be received in the same way, or the receiving mode may be selected according to the specific setup. For example, the media information streams of the selected camera positions in a scene may be pulled via the Real Time Messaging Protocol (RTMP). The embodiment only requires that a plurality of media information streams can be received; the specific transport is not limited here, so the application also applies to scenarios in which the media information streams are pulled in other ways.
In an embodiment, the timing and number of the media information streams, and of the media information packets within each stream, need not be limited and can be set for the specific scene. For example, a venue typically has more than 50 camera positions, corresponding to more than 50 media information streams to receive; since users enter the venue for viewing at a particular time, the transmission or playback time of the selected media information streams can be set near that time so that users can watch the video then.
Step S120: acquiring a first display time stamp of the received target media information packet, wherein the target media information packet is the first received media information packet among all media information packets.
In this step, the inconsistency with which the pictures of the individual media information streams for the same moment arrive at the media server must be remedied: regardless of the order in which they arrive, all media information streams need to be synchronized. To avoid missing or mismatched media information streams, at least the first received media information packet must be identified as the starting point. The first received packet among all media information packets is therefore found and taken as the target media information packet, and its first display timestamp is obtained, so that in subsequent steps the display timestamps of all media information packets can be aligned to it.
In one embodiment, the first display timestamp of the received target media information packet may be obtained in various ways, which are not limited here. For example, the display timestamps of all media information packets can be collected and compared to obtain the first display timestamp of the target media information packet.
Step S130: the first display time stamp is used as a starting display time stamp of each media information stream.
In this step, the first display timestamp is taken as the starting display timestamp of every media information stream, so that the display timestamps of all media information streams are synchronized to the same starting point. This remedies the inconsistent arrival of the individual streams at the media server for the same moment, and allows each stream to be sliced and aggregated according to the starting display timestamp in subsequent steps.
A specific example is given below to illustrate the working principle and flow of the above embodiments.
Example one:
fig. 2a shows a schematic diagram of a plurality of media information streams provided by an embodiment of the present application before alignment, and fig. 2b shows the same streams after alignment. As an example, the media information streams of 3 camera positions are shown, and the stream of each camera position consists of a plurality of slices.
Taking the media server as the execution subject: once media information packets are received into the stream-receiving buffer queue, each media information packet in the queue is traversed and checked for being the first received media information packet. If it is, the start point of the first slice of every camera position is forcibly set to the first display timestamp (Presentation Time Stamp, PTS) of the current slice, i.e., the PTS of this first received media information packet. Otherwise, the packet is stored into the linked list of its camera position, and the check is repeated for the next packet until the required first media information packet is found.
As shown in fig. 2a, the media information stream of each camera position is shown without modifying the starting display timestamp; the numbers in the boxes represent the PTS of the current media information packet. The slice duration is 6 s: the first slice of camera 1 covers the PTS range [0, 540000) with start PTS 0, the first slice of camera 2 covers [7200, 547200) with start PTS 7200, and the first slice of camera 3 covers [3600, 543600) with start PTS 3600. Because the PTS ranges of the initial slices of the camera positions are inconsistent, the terminal suffers large spatial jumps of the picture when switching between camera positions.
As shown in fig. 2b, the media information stream of each camera position is shown with the starting display timestamp modified. Taking the case where the first received packet belongs to camera 2 (i.e., the first media information packet in the receive buffer queue comes from camera 2) and the slice duration is 6 s: compared with the original streams, the first slice of camera 1 now covers the PTS range [0, 547200) with start PTS 7200, the first slice of camera 2 covers [7200, 547200) with start PTS 7200, and the first slice of camera 3 covers [3600, 547200) with start PTS 7200, so the start PTS of the second slice of every camera position is the same. Since the start PTS and the slice duration of each camera position's slices are now identical, slices with the same sequence number cover the same time period even though the media information streams reach the media server at different times.
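The PTS arithmetic of Example One can be checked with a short calculation. The sketch below assumes a 90 kHz timebase, under which a 6 s slice spans 6 * 90000 = 540000 PTS ticks; this timebase is an assumption consistent with the numbers in figs. 2a and 2b, not stated explicitly in the application.

```python
# Reproduce the PTS arithmetic of Example One under an assumed 90 kHz timebase.
TIMEBASE = 90000
SLICE_TICKS = 6 * TIMEBASE   # 6 s slice = 540000 PTS ticks

# Unaligned case (fig. 2a): each camera's first slice starts at its own PTS.
machine_first_pts = {1: 0, 2: 7200, 3: 3600}
unaligned_ranges = {m: (pts, pts + SLICE_TICKS)
                    for m, pts in machine_first_pts.items()}

# Aligned case (fig. 2b): the first received packet (camera 2, PTS 7200)
# forces a common start PTS; every camera's first slice ends at the same tick.
forced_start = 7200
aligned_first_slice_end = forced_start + SLICE_TICKS   # 547200
second_slice_start = aligned_first_slice_end           # identical for all cameras
```

The computed unaligned ranges [0, 540000), [7200, 547200), and [3600, 543600) match fig. 2a, and the common second-slice start of 547200 matches the aligned layout of fig. 2b.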
Step S140: performing information slicing on each media information stream according to the initial display time stamp to obtain a plurality of media slicing information of each media information stream, wherein the media slicing information corresponds to slicing serial numbers, and all the media slicing information with the same slicing serial number have the same media duration;
In this step, since the starting display timestamp of each media information stream was determined in step S130, each media information stream can be sliced according to that starting display timestamp to obtain a plurality of pieces of media slice information per stream, distinguished by slice sequence numbers. All pieces of media slice information with the same slice sequence number have the same media duration, so for different media information streams, comparing slice sequence numbers confirms that the pieces of media slice information cover the same time period; in subsequent steps, the pieces for the same time period are aggregated into one complete free-viewpoint slice.
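The aggregation by slice sequence number described above might be sketched as follows; the `(stream_id, slice_seq)` pair representation is an illustrative assumption, not the application's data model.

```python
from collections import defaultdict


def aggregate_free_viewpoint(slices):
    """Group media slices by slice sequence number across streams; each group
    of same-numbered slices forms one free-viewpoint slice. `slices` is a
    list of (stream_id, slice_seq) pairs (illustrative shape)."""
    groups = defaultdict(list)
    for stream_id, slice_seq in slices:
        groups[slice_seq].append(stream_id)
    return dict(groups)
```

For three camera streams each producing slices 0 and 1, the result maps sequence number 0 to all streams that contributed a slice 0, and likewise for 1.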
As shown in fig. 3, step S140 is further described, and step S140 includes, but is not limited to, steps S141 and S142.
Step S141: for each media information stream, acquiring a second display time stamp of the currently received media information packet;
step S142: when the information-slicing condition is determined to be met from the second display time stamp and the starting display time stamp, performing initial information slicing according to the currently received media information packet, taking the second display time stamp as a new starting display time stamp, and performing subsequent information slicing according to the new starting display time stamp.
In this step, the second display timestamp of the currently received media information packet is obtained and compared with the aligned starting display timestamp to decide whether the information-slicing condition is met. If it is, initial information slicing is performed on the basis of the currently received packet, and the qualifying second display timestamp becomes the new starting display timestamp for subsequent slicing, so that complete media slice information is obtained for the currently received packets and, in later steps, the pieces of media slice information for the same time period can be aggregated into one complete free-viewpoint slice.
In an embodiment, the information-slicing condition can be set according to the specific scenario and is not limited here. For example, the condition may include, but is not limited to: the ratio of the difference between the second display timestamp and the starting display timestamp to a preset time reference is greater than or equal to a preset slice duration. The preset time reference may be, but is not limited to, the timebase of the corresponding media information stream; when all media information packets have the same duration, that duration may be, but is not limited to, taken as the preset slice duration. The difference between the two display timestamps measures how far the second display timestamp has advanced beyond the starting one, i.e., whether it is large enough to trigger the next slice. When the ratio of the difference to the preset time reference is smaller than the preset slice duration, it can be determined that the currently received media information packet does not need to open a new slice.
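The information-slicing condition above can be expressed directly in code; the default timebase and slice duration in this sketch are illustrative assumptions, not values fixed by the application.

```python
def meets_slicing_condition(second_pts, start_pts, timebase=90000, slice_seconds=6):
    """True when (second_pts - start_pts) / timebase reaches the preset slice
    duration, i.e. the current packet should open a new slice. The 90 kHz
    timebase and 6 s duration are assumed defaults for illustration."""
    return (second_pts - start_pts) / timebase >= slice_seconds
```

With a start PTS of 7200, a packet at PTS 547200 has advanced exactly 6 s and triggers a new slice, while a packet at PTS 540000 (about 5.92 s later) does not.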
In one embodiment, the second display timestamp of the currently received media information packet may be obtained in various ways, which are not limited here. For example, the display timestamps of all media information packets can be collected and compared to obtain the second display timestamp of the currently received media information packet.
In an embodiment, after the subsequent information slicing is performed according to the new starting display timestamp, slicing can continue in the manner of step S142: when the duration of each slice is known, the next starting display timestamp can be determined from the slice duration, the previous starting display timestamp, and the preset time reference, and further slicing can proceed from that next starting display timestamp.
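Deriving the next starting display timestamp from the previous one, as described above, might look like the following; again the timebase and slice duration are assumed values for illustration.

```python
def next_start_pts(last_start_pts, slice_seconds=6, timebase=90000):
    """Next slice's starting display timestamp = previous starting timestamp
    plus the slice duration expressed in timebase ticks (values assumed)."""
    return last_start_pts + slice_seconds * timebase
```

Starting from 7200, successive slice boundaries fall at 547200, 1087200, and so on.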
As shown in fig. 4, an embodiment of the present application further describes steps performed before steps S141 to S142, including but not limited to steps S160 to S180.
Step S160: detecting whether a first target media information stream exists, wherein the first target media information stream is a media information stream meeting a cut-off recovery condition;
Step S170: when it is detected that the first target media information stream exists, acquiring the difference values between the second display time stamp corresponding to the first target media information stream and the initial display time stamps corresponding to a plurality of second target media information streams, wherein a second target media information stream is a media information stream that does not meet the cut-off recovery condition;
Step S180: updating the initial display time stamp and the slice sequence number of the first target media information stream to the initial display time stamp and the slice sequence number of the second target media information stream corresponding to a target difference value, wherein the target difference value is the smallest of all the difference values.
In these steps, since cut-off recovery affects the subsequent information slicing performed on the media information packets, step S160 detects whether there is a first target media information stream satisfying the cut-off recovery condition. When a first target media information stream is detected, the difference values between the second display time stamp corresponding to the first target media information stream and the initial display time stamps corresponding to the second target media information streams are acquired; that is, the display time stamp differences between the first target media information stream (which satisfies the cut-off recovery condition) and all second target media information streams (which do not) are considered. The initial display time stamp and slice sequence number of the second target media information stream corresponding to the target difference value are then selected as the basis for updating the initial display time stamp and slice sequence number of the first target media information stream. Because the target difference value is the smallest of all the difference values, the first target media information stream is updated to the initial display time stamp and slice sequence number of its nearest-neighbor second target media information stream, so that the deviation introduced by the cut-off into subsequent information slicing is reduced as much as possible.
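The nearest-neighbor selection of step S180 can be sketched as follows; the names and data layout are illustrative assumptions, and the example numbers reuse those of fig. 8 in Example three:

```python
def pick_recovery_source(recovered_pts: int, normal_streams: list) -> tuple:
    """normal_streams holds (start_pts, seg_no) pairs for the second
    target media information streams; return the pair whose initial
    display time stamp is nearest below the recovered stream's PTS."""
    return min(normal_streams, key=lambda s: recovered_pts - s[0])

# Machine position 2: startpts 540000, segno 2; machine position 3:
# startpts 1080000, segno 3. A stream recovering at PTS 1083600 adopts
# machine position 3's startpts and segno (smallest difference, 3600).
print(pick_recovery_source(1083600, [(540000, 2), (1080000, 3)]))  # (1080000, 3)
```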
In an embodiment, the cut-off recovery condition may be set according to the specific scenario, which is not limited herein. For example, the cut-off recovery condition may include, but is not limited to: the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to a preset time reference being greater than a preset timeout period. The preset time reference may be, but is not limited to, the time base of the corresponding media information stream. Comparing the second display time stamp of the currently received media information packet with the display time stamp of the last received media information packet measures the gap between the two, so that the actual timeout degree of the second display time stamp can be better determined. It can be understood that when the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to the preset time reference is less than or equal to the preset timeout period, it can be determined that cut-off recovery does not need to be performed for the currently received media information packet.
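A sketch of this cut-off recovery check; the symbol names follow Example three (lastpts, timebase, overtime), while the function name and example values are assumptions:

```python
def cut_off_recovered(cur_pts: int, last_pts: int,
                      timebase: int, overtime: float) -> bool:
    """True when the gap between the current and previous packet's
    display time stamps exceeds the preset timeout period (seconds)."""
    return (cur_pts - last_pts) / timebase > overtime

# 90 kHz time base, 5-second timeout: a 6-second gap counts as a cut-off,
# a 1-second gap does not.
print(cut_off_recovered(540000, 0, 90000, 5))  # True
print(cut_off_recovered(90000, 0, 90000, 5))   # False
```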
Step S150: aggregating the target media slice information in all the media information streams to obtain free viewpoint media slice information, wherein the target media slice information is the media slice information with the same slice sequence number.
In this step, the first display time stamp of the obtained target media information packet is uniformly set as the initial display time stamp of each media information stream, which overcomes the defect that the pictures of the media information streams arriving at the media server at the same moment are inconsistent. Information slicing is then performed on each media information stream according to the initial display time stamp to obtain a plurality of pieces of media slice information, and the media slice information with the same slice sequence number in all media information streams is aggregated to obtain complete free viewpoint media slice information. This avoids large-scale spatial jumps of the video picture when the user switches viewing angles while preserving picture quality; the embodiment of the present application therefore enables the user to switch seamlessly between free viewpoints, improves the user's video experience, and fills a technical gap in related methods.
In an embodiment, the target media slice information may be, but is not limited to, the media slice information whose slice sequence number is not 1. Referring to the examples of fig. 2a and fig. 2b, the display time stamp of the first piece of media slice information in the media information stream of each machine position is modified to the first display time stamp of the first received media information packet, so the durations of the first pieces of media slice information of the machine positions (i.e., those with slice sequence number 1) do not correspond to one another. If the media slice information with slice sequence number 1 were aggregated directly, it would not align with the media slice information with slice sequence number 2; therefore, aggregation may start from the media slice information with slice sequence number 2, so as to obtain reliable and stable free viewpoint media slice information.
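The aggregation rule above (same slice sequence number across all streams, skipping sequence number 1) can be sketched as follows; the data layout is an assumption for illustration, not the patent's own structure:

```python
def aggregate_slices(streams: dict) -> dict:
    """streams maps machine id -> {seg_no: slice payload}. Aggregate the
    slices sharing a sequence number across all machines, starting from
    seg_no 2 because the forcibly aligned first slices differ in duration."""
    common = set.intersection(*(set(s) for s in streams.values())) - {1}
    return {n: [streams[m][n] for m in sorted(streams)] for n in sorted(common)}

streams = {1: {1: "m1s1", 2: "m1s2"}, 2: {1: "m2s1", 2: "m2s2"}}
print(aggregate_slices(streams))  # {2: ['m1s2', 'm2s2']}
```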
In an embodiment, unlike the prior art, the video stream of each machine position does not need to be compressed and spliced into a single ultra-high-resolution image followed by image correction. Instead, each media information stream is sliced according to its display time stamps and aggregated based on the target media slice information, yielding the final free viewpoint media slice information. This greatly reduces the requirement on network bandwidth and makes the method applicable to more users. Moreover, the slice-splicing approach of the embodiment of the present application does not need to consider the actual resolution of each machine position, that is, the resolution of each machine position does not need to be lowered to match the resolution played by the user, which further improves the user experience.
As shown in fig. 5, step S150 is further described, and step S150 includes, but is not limited to, steps S151 to S153.
Step S151: traversing the target media slice information in each media information stream in sequence;
Step S152: judging whether the current target media slice information is the first media slice information after cut-off recovery;
Step S153: if the current target media slice information is not the first media slice information after cut-off recovery, aggregating the current target media slice information.
In these steps, the target media slice information in each media information stream is traversed to judge whether the current target media slice information is the first media slice information after cut-off recovery. As explained in the embodiments above, the first media slice information after cut-off recovery, like the media slice information whose first display time stamp has been modified for each machine position, is not well suited to aggregation. Therefore, the current target media slice information is aggregated only when it is judged not to be the first media slice information after cut-off recovery, so as to obtain reliable and stable free viewpoint media slice information. That is, in the cut-off recovery case, aggregation resumes only from the second media slice information after recovery, which yields better free viewpoint media slice information.
In one embodiment of the present application, step S150 is further described on the basis of steps S151 to S153, and step S150 further includes, but is not limited to, step S154.
Step S154: if the current target media slice information is the first media slice information after cut-off recovery, not aggregating the current target media slice information.
In this step, since the first media slice information after cut-off recovery, like the media slice information whose first display time stamp has been modified for each machine position, is not well suited to aggregation, the current target media slice information is not aggregated when it is judged to be the first media slice information after cut-off recovery. The overall aggregation process of the free viewpoint media slice information is therefore unaffected; that is, in the cut-off recovery case, aggregation resumes only from the second media slice information after recovery, which yields better free viewpoint media slice information.
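Steps S151 to S154 amount to a filter over the candidate slices; a minimal sketch follows, in which the per-slice flag is an assumed bookkeeping field rather than something the patent names:

```python
def slices_eligible_for_aggregation(slices: list) -> list:
    """Keep only slices that are not the first slice emitted after a
    cut-off recovery; those are skipped per steps S153/S154."""
    return [s for s in slices if not s.get("first_after_recovery", False)]

pending = [
    {"machine": 1, "seg_no": 4, "first_after_recovery": True},  # just recovered
    {"machine": 2, "seg_no": 4},
    {"machine": 3, "seg_no": 4},
]
print([s["machine"] for s in slices_eligible_for_aggregation(pending)])  # [2, 3]
```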
Various specific examples are given below to illustrate the working principles and flow of the above embodiments.
Example two:
Fig. 6 is a schematic diagram of a media server for performing a media information processing method according to an embodiment of the present application.
Referring to fig. 6, the media server may include, but is not limited to, a stream-receiving module, an alignment module, and a splicing module, wherein:
the stream-receiving module is configured to pull the media stream of each machine position at the free viewpoint front end (namely the machine position 1 media stream, machine position 2 media stream, machine position 3 media stream, ..., machine position n media stream shown in fig. 6) and add it to a stream-receiving buffer queue;
the alignment module is configured to take the media streams out of the stream-receiving buffer queue, perform alignment processing on them, and then slice them;
and the splicing module is configured to aggregate the slices of all machine positions with the same slice sequence number into a complete free viewpoint slice.
According to the above example, through the cooperation of the stream-receiving module, the alignment module, and the splicing module, the user can switch seamlessly between free viewpoints and the user's video experience is improved, which fills a technical gap in related methods.
Example three:
The working principle and flow of the alignment module in Example two are described in detail below.
Fig. 7 is a flowchart of a media information processing method performed by the alignment module according to an embodiment of the present application.
Referring to fig. 7, the alignment module may, but is not limited to, perform the following steps:
Step a: traversing each media information packet in the stream-receiving buffer queue and judging whether the current media information packet is the first received media information packet; if so, forcibly setting the startpts of the first slice of all machine positions to the PTS of the first received media information packet (namely the current media information packet) and then entering step b; otherwise, entering step b directly without any processing.
Step b: storing each media information packet into the linked list corresponding to its machine position.
Step c: judging whether the machine position has a cut-off recovery scenario according to the formula (curpts - lastpts) / timebase > overtime; if so, entering step d, otherwise entering step e, wherein curpts represents the PTS of the current media information packet of the machine position, lastpts represents the PTS of the last media packet of the machine position, timebase represents the time base of the media stream, and overtime represents the preset timeout period.
Step d: calculating the difference diffpts between curpts and the startpts of each other normal machine position, finding the startpts and segno (segno denotes the slice sequence number, incremented from 1) of the machine position corresponding to the minimum diffpts, and setting them as the corresponding information of the cut-off recovered machine position. Specifically, fig. 8 is a schematic diagram of a plurality of media information streams provided by another embodiment of the present application, in which the numbers in the boxes represent the PTS values of the media information packets. Suppose the startpts of machine position 2 is 540000 with segno 2, and the startpts of machine position 3 is 1080000 with segno 3. When machine position 1 recovers from a cut-off with a current packet whose PTS is 1083600, the PTS differences against the normal machine positions are calculated: 1083600 - 540000 = 543600 for machine position 2 and 1083600 - 1080000 = 3600 for machine position 3. The difference for machine position 3 is the minimum, so the startpts and segno of machine position 3 (1080000 and 3) are set as the corresponding information of machine position 1, which therefore aligns with machine position 3 after cut-off recovery. When the next media information packet of machine position 2 arrives with PTS 1080000, machine position 2 switches to its next slice and likewise aligns with machine positions 1 and 3.
Step e: judging whether the machine position meets the slicing condition according to the formula (curpts - startpts) / timebase >= min_seg_duration; if so, slicing directly, naming the slice with segno, incrementing segno by 1, and entering step a; otherwise, entering step a directly without any processing, wherein min_seg_duration represents the preset slicing duration.
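Steps a to e above can be combined into a single per-packet routine. This is a hedged sketch: the state container and control flow are assumptions, while the formulas and symbol names (startpts, lastpts, segno, timebase, overtime, min_seg_duration) are those of this example:

```python
class MachineState:
    """Per-machine-position slicing state, as implied by steps a-e."""
    def __init__(self):
        self.startpts = None  # initial display time stamp of current slice
        self.lastpts = None   # PTS of the last received packet
        self.segno = 1        # slice sequence number, incremented from 1

def on_packet(machine: int, curpts: int, states: dict,
              timebase: int, overtime: float, min_seg_duration: float) -> bool:
    """Process one media information packet; return True when a slice closes."""
    st = states[machine]
    # Step a: on the very first packet, force every machine position's
    # first slice to start at this packet's PTS.
    if all(s.startpts is None for s in states.values()):
        for s in states.values():
            s.startpts = curpts
    # Steps c-d: on cut-off recovery, adopt the startpts and segno of
    # the nearest normal machine position.
    if st.lastpts is not None and (curpts - st.lastpts) / timebase > overtime:
        src = min((s for m, s in states.items() if m != machine),
                  key=lambda s: curpts - s.startpts)
        st.startpts, st.segno = src.startpts, src.segno
    # Step e: close the slice once it reaches the preset duration.
    sliced = (curpts - st.startpts) / timebase >= min_seg_duration
    if sliced:
        st.startpts = curpts  # next slice starts at the current packet
        st.segno += 1
    st.lastpts = curpts
    return sliced

states = {1: MachineState(), 2: MachineState()}
print(on_packet(1, 0, states, 90000, 5, 2))       # False (first packet)
print(on_packet(1, 180000, states, 90000, 5, 2))  # True (2 s elapsed)
```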
Example four:
The working principle and flow of the splicing module in Example two are described in detail below.
Fig. 9 is a flowchart of a method for performing media information processing by a splicing module according to an embodiment of the present application.
Referring to fig. 9, the splicing module may, but is not limited to, perform the following steps:
Step a: scanning the slice information and judging whether the slice sequence number n to be aggregated is 1; if not, entering step b; if so, incrementing the slice sequence number by 1 and performing step a again. Because the durations of the first slices of all machine positions are inconsistent after the forced alignment, no aggregation operation is performed on the first slice of any machine position.
Step b: traversing the slices with the same sequence number of all machine positions in turn, namely the slices with sequence number n, and judging whether the current slice is the first slice after cut-off recovery; if so, performing step b again, otherwise entering step c.
Step c: aggregating the slice of the machine position into the free viewpoint media slice information, and judging whether all slices with sequence number n have been scanned; if so, incrementing the slice sequence number by 1 and entering step a; otherwise, entering step b.
As can be seen from the above examples, the embodiment of the present application forcibly sets the initial PTS of the first slice of all machine positions in the initialization stage; in the running stage it slices according to the slicing duration and increments the slice sequence number; when a cut-off recovery scenario is detected at a machine position, the initial PTS and slice sequence number of that machine position's current slice are recalculated; and the slice information of all machine positions in the same time period is then aggregated by slice sequence number into complete free viewpoint slices from which the user can select a viewing angle to play. This solves the problem that the pictures of the machine position code streams arriving at the media server at the same moment are inconsistent, avoids large-scale spatial jumps of the video picture when the user switches viewing angles, preserves picture quality, and reduces the bandwidth and performance requirements on the terminal device, so that the user can switch seamlessly between free viewpoints and the user's video experience is improved.
The method of the embodiment of the present application can be widely applied to scenarios such as panoramic video generation under VR and virtual viewpoint scenes.
In addition, as shown in fig. 10, an embodiment of the present application also discloses a media information processing device 100, including: at least one processor 110; and at least one memory 120 for storing at least one program; when the at least one program is executed by the at least one processor 110, the media information processing method of any of the previous embodiments is implemented.
In addition, an embodiment of the present application also discloses a computer-readable storage medium in which computer-executable instructions for performing the media information processing method of any of the previous embodiments are stored.
Furthermore, an embodiment of the present application also discloses a computer program product including a computer program or computer instructions stored in a computer-readable storage medium, the computer program or computer instructions being read from the computer-readable storage medium by a processor of a computer device, the processor executing the computer program or computer instructions to cause the computer device to perform the media information processing method as in any of the previous embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (10)

1. A media information processing method, comprising:
receiving a plurality of media information streams, wherein the media information streams comprise a plurality of media information packets;
acquiring a first display time stamp of a received target media information packet, wherein the target media information packet is a first received media information packet in all the media information packets;
taking the first display time stamp as a starting display time stamp of each media information stream;
performing information slicing on each media information stream according to the initial display time stamp to obtain a plurality of pieces of media slicing information of each media information stream, wherein each piece of media slicing information corresponds to a slicing sequence number, and all pieces of media slicing information with the same slicing sequence number have the same media duration;
and aggregating the target media slicing information in all the media information streams to obtain free viewpoint media slicing information, wherein the target media slicing information is the media slicing information with the same slicing sequence number.
2. The media information processing method according to claim 1, wherein performing the information slicing on each of said media information streams according to said start display time stamp comprises:
for each media information stream, acquiring a second display time stamp of the currently received media information packet; when it is determined, according to the second display time stamp and the initial display time stamp, that the information slicing condition is met, performing initial information slicing according to the currently received media information packet, taking the second display time stamp as a new initial display time stamp, and performing subsequent information slicing according to the new initial display time stamp.
3. The media information processing method of claim 2, wherein the information slicing condition comprises:
and the ratio of the difference between the second display time stamp and the initial display time stamp to a preset time reference is greater than or equal to a preset slicing duration.
4. The method according to claim 2, wherein before the information slicing is performed on each of the media information streams according to the start display time stamp, the method further comprises:
detecting whether a first target media information stream exists, wherein the first target media information stream is the media information stream meeting the cut-off recovery condition;
when the first target media information stream is detected to exist, acquiring a difference value between the second display time stamp corresponding to the first target media information stream and the initial display time stamp corresponding to a plurality of second target media information streams, wherein the second target media information stream is the media information stream which does not meet the cut-off recovery condition;
updating the initial display time stamp and the fragment sequence number of the first target media information stream to the initial display time stamp and the fragment sequence number of the second target media information stream corresponding to a target difference value, wherein the target difference value is the smallest of all the difference values.
5. The media information processing method of claim 4, wherein the break recovery condition comprises:
and the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to a preset time reference is greater than a preset timeout period.
6. The media information processing method of claim 1, wherein the target media slice information is the media slice information with the slice sequence number not 1.
7. The media information processing method according to claim 6, wherein aggregating the target media slicing information in all the media information streams comprises:
traversing the target media fragment information in each media information stream in sequence;
judging whether the current target media slicing information is the first media slicing information after the interruption recovery;
And if the current target media slicing information is not the first media slicing information after the interruption recovery, aggregating the current target media slicing information.
8. The media information processing method according to claim 7, wherein aggregating the target media slicing information in all the media information streams further comprises:
and if the current target media slicing information is the first media slicing information after the interruption recovery, not aggregating the current target media slicing information.
9. A media information processing device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the media information processing method according to any one of claims 1 to 8 when executing the computer program.
10. A computer-readable storage medium storing computer-executable instructions for performing the media information processing method of any one of claims 1 to 8.
CN202210642307.6A 2022-06-08 2022-06-08 Media information processing method and device and storage medium Pending CN117241105A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210642307.6A CN117241105A (en) 2022-06-08 2022-06-08 Media information processing method and device and storage medium
PCT/CN2023/089286 WO2023236666A1 (en) 2022-06-08 2023-04-19 Media information processing method and apparatus, and storage medium


Publications (1)

Publication Number Publication Date
CN117241105A 2023-12-15


Also Published As

Publication number Publication date
WO2023236666A1 (en) 2023-12-14

