CN117241105A - Media information processing method and device and storage medium - Google Patents

Media information processing method and device and storage medium

Info

Publication number
CN117241105A
CN117241105A (application CN202210642307.6A)
Authority
CN
China
Prior art keywords
media
information
slicing
media information
time stamp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210642307.6A
Other languages
Chinese (zh)
Inventor
陈奇
王魏强
张晓渠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202210642307.6A priority Critical patent/CN117241105A/en
Priority to PCT/CN2023/089286 priority patent/WO2023236666A1/en
Publication of CN117241105A publication Critical patent/CN117241105A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845: Structuring of content, e.g. decomposing content into time segments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/8547: Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a media information processing method, a device, and a storage medium. The media information processing method comprises the following steps: receiving a plurality of media information streams; acquiring the first display timestamp of a received target media information packet; taking that first display timestamp as the starting display timestamp of every media information stream; slicing each media information stream according to the starting display timestamp to obtain a plurality of pieces of media slice information, where each piece of media slice information carries a slice sequence number and all pieces with the same slice sequence number have the same media duration; and aggregating all target media slice information, i.e. the pieces of media slice information having the same slice sequence number, to obtain free-viewpoint media slice information. With the embodiments of the application, a user can switch seamlessly between free viewpoints and the user's video experience is improved, filling a gap in related methods.

Description

Media information processing method and device and storage medium
Technical Field
The application relates to the field of video technology, and in particular to a media information processing method and device and a computer storage medium.
Background
With the rapid development of 5G and high-speed Internet access, the metaverse and fully immersive Internet are arriving quickly, and immersive media applications are developing rapidly. Current free-viewpoint technology lets viewers freely choose any 360-degree viewing angle at any moment, improving the immersive experience: a user can switch viewing angles freely while watching a video. However, because the video streams of the different viewing angles of the same moment, shot by multiple cameras, may reach the media server with a large time difference, good image quality of the whole picture cannot be guaranteed, which greatly degrades the user experience.
Disclosure of Invention
The embodiments of the application provide a media information processing method and device and a computer storage medium, which can improve a user's video viewing experience.
In a first aspect, an embodiment of the present application provides a media information processing method, including:
receiving a plurality of media information streams, wherein the media information streams comprise a plurality of media information packets;
acquiring a first display time stamp of a received target media information packet, wherein the target media information packet is a first received media information packet in all the media information packets;
taking the first display time stamp as the starting display time stamp of each media information stream;
performing information slicing on each media information stream according to the starting display time stamp to obtain a plurality of pieces of media slice information of each media information stream, wherein each piece of media slice information corresponds to a slice sequence number, and all pieces of media slice information with the same slice sequence number have the same media duration;
and aggregating the target media slice information in all the media information streams to obtain free-viewpoint media slice information, wherein the target media slice information is the media slice information with the same slice sequence number.
In a second aspect, an embodiment of the present application further provides a media information processing device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the media information processing method as described above when executing the computer program.
In a third aspect, embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions for performing a media information processing method as described above.
In the embodiments of the application, the first display timestamp of the target media information packet is uniformly set as the starting display timestamp of every media information stream, which remedies the inconsistency with which the pictures of the individual media information streams for the same moment arrive at the media server. Each media information stream is then sliced according to the starting display timestamp to obtain a plurality of pieces of media slice information, and the pieces of media slice information with the same slice sequence number across all media information streams are aggregated into complete free-viewpoint media slice information. This preserves picture quality while preventing large spatial jumps of the video picture when the user switches viewing angles. The embodiments therefore allow the user to switch seamlessly between free viewpoints, improve the user's video experience, and fill a gap in related methods.
Drawings
FIG. 1 is a flow chart of a media information processing method provided by an embodiment of the present application;
FIG. 2a is a schematic diagram of a plurality of media information streams provided in accordance with one embodiment of the present application prior to alignment;
FIG. 2b is a schematic diagram of a plurality of media information streams aligned according to one embodiment of the present application;
FIG. 3 is a flowchart of obtaining a plurality of pieces of media slice information of each media information stream in a media information processing method according to another embodiment of the present application;
FIG. 4 is a flowchart of the steps preceding the obtaining of a plurality of pieces of media slice information of each media information stream in a media information processing method according to an embodiment of the present application;
FIG. 5 is a flowchart of obtaining free viewpoint media slice information in a media information processing method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a media server for performing a media information processing method according to one embodiment of the present application;
FIG. 7 is a flowchart of a media information processing method performed by an alignment module according to one embodiment of the present application;
FIG. 8 is a schematic diagram of a plurality of media information streams provided by another embodiment of the present application;
FIG. 9 is a flowchart of a method for performing media information processing by a stitching module according to one embodiment of the present application;
fig. 10 is a schematic diagram of a media information processing device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order. The terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar elements and do not necessarily describe a particular sequence or chronological order.
At present, to mitigate the large time difference with which the video streams of all viewing angles of the same moment, shot by multiple cameras, reach the media server, the prior art compresses and splices the video streams of all cameras into one picture of ultra-high resolution and then performs image correction. This places a relatively high network-bandwidth requirement on the user, and the camera resolution must be reduced to match the resolution of the user's player, which greatly degrades the user experience.
Based on this, the application provides a media information processing method and device, a computer storage medium, and a computer program product. One embodiment of the media information processing method includes: receiving a plurality of media information streams, each comprising a plurality of media information packets; acquiring the first display timestamp of a received target media information packet, the target media information packet being the first received packet among all media information packets; taking the first display timestamp as the starting display timestamp of every media information stream; slicing each media information stream according to the starting display timestamp to obtain a plurality of pieces of media slice information per stream, where each piece of media slice information carries a slice sequence number and all pieces with the same slice sequence number have the same media duration; and aggregating the target media slice information across all media information streams, i.e. the pieces of media slice information with the same slice sequence number, to obtain free-viewpoint media slice information.
In this embodiment, the first display timestamp of the target media information packet is uniformly set as the starting display timestamp of every media information stream, remedying the inconsistency with which the pictures of the individual streams for the same moment arrive at the media server. Each media information stream is then sliced according to the starting display timestamp to obtain a plurality of pieces of media slice information, and the pieces with the same slice sequence number across all streams are aggregated into complete free-viewpoint media slice information, which preserves picture quality while preventing large spatial jumps of the video picture when the user switches viewing angles. The embodiment therefore allows the user to switch seamlessly between free viewpoints, improves the user's video experience, and fills a gap in related methods.
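The flow described above can be sketched in code. The following Python sketch is illustrative only: the names (`MediaPacket`, `align_start_pts`, `slice_sequence_number`), the 90 kHz timebase, and the 6 s slice duration are assumptions drawn from the numeric example later in the description, not definitions from the application itself.

```python
# Illustrative sketch of the method's data flow: align all streams to one
# start PTS, then map each packet's PTS to a slice sequence number so that
# slices with the same number cover the same time period across streams.
from dataclasses import dataclass

TIMEBASE = 90000       # assumed 90 kHz clock, typical for video PTS values
SLICE_SECONDS = 6      # slice duration used in the description's example


@dataclass
class MediaPacket:
    stream_id: int     # camera position the packet came from
    pts: int           # display timestamp in timebase ticks


def align_start_pts(packets):
    """Return the PTS of the first received packet; it becomes the starting
    display timestamp of every media information stream (steps S120/S130)."""
    return packets[0].pts


def slice_sequence_number(pts, start_pts):
    """Map a packet PTS to a slice sequence number relative to the shared
    starting display timestamp (step S140)."""
    return (pts - start_pts) // (SLICE_SECONDS * TIMEBASE)
```

With the packets of the later example (first received packet from camera 2 at PTS 7200), every stream adopts 7200 as its starting display timestamp, and a packet at PTS 547200 falls into slice number 1.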
Embodiments of the present application will be further described below with reference to the accompanying drawings.
As shown in fig. 1, fig. 1 is a flowchart of a media information processing method according to an embodiment of the present application, and the media information processing method may include, but is not limited to, steps S110 to S150.
Step S110: a plurality of media information streams is received, wherein the media information streams include a plurality of media information packets.
In this step, a plurality of media information streams are received so that, in subsequent steps, it can be determined which media information stream contains the first received media information packet and the received streams can be accurately distinguished from one another.
In an embodiment, the execution subject of steps S110 to S150 and the related steps may be chosen by those skilled in the art according to the specific situation, and is not limited here. For example, a media server that manages all media information streams may act as the execution subject: the media server receives the plurality of media information streams and performs the following steps S120 to S150 and related steps on them. Corresponding functional modules can be provided in the media server to execute the corresponding steps for a better overall effect; for instance, a stream-receiving module can be provided in the media server to pull the media information streams from each camera position at the free-viewpoint front end and add them to a stream-receiving buffer queue inside the module. As another example, another server, node, module, or device that manages the media server may serve as the execution subject, indirectly processing the multiple media information streams through the media server. It should be noted that the following embodiments mainly take the "media server" as the execution subject of steps S110 to S150 and the related steps, but the application is not limited thereto.
In one embodiment, the media server is an important device of the next-generation network. Under the control of a control device (e.g., a softswitch or an application server), it provides the media resource functions required to implement various services on an IP network, including service tones, conferencing, interactive response, announcements, unified messaging, and advanced voice services. The application server can send commands such as playback to the media server using, but not limited to, MSML (Media Server Markup Language). The media server is highly tailorable and can flexibly implement one or more functions, including but not limited to:
dual-tone multi-frequency (DTMF) signal acquisition and decoding: receiving DTMF signals from a DTMF telephone according to the operation parameters specified by the control device, encapsulating them in signaling, and transmitting the signaling to the control device;
recorded-announcement sending: playing a specified recorded announcement to a user in a specified voice, as required by the control device;
conference function: supporting audio mixing of multiple RTP streams, including streams in different coding formats;
conversion between different codec algorithms: supporting multiple voice codecs such as G.711, G.723, and G.729, and converting between them;
automatic speech synthesis: concatenating multiple voice elements or fields into a complete voice prompt, which may be fixed or variable;
dynamic voice play/record: including music on hold, follow-me voice services, etc.;
generation and transmission of tone signals: providing basic signal tones such as dial tone, busy tone, ringback tone, waiting tone, and silence;
maintenance and management of resources: local and/or remote maintenance and management of the media resources and the device itself, such as data configuration and fault management.
The media server has at least one of the following characteristics:
advancedness: ITU-T H.248 and the SIP standard protocol can be adopted;
compatibility: interworking with softswitch systems from different vendors can be achieved conveniently;
high reliability: dual power supplies with hot-plug support are provided; the device is positioned as carrier-class equipment and protects against system congestion;
easy maintainability: SNMP network management is supported, and the system, resources, and post-hoc analysis can be maintained online;
high scalability and easy upgradeability: an independent application layer can customize various value-added services for users, and the system can be upgraded online, meeting user needs to the greatest extent;
flexibility: flexible networking modes and strong comprehensive access capability provide users with a variety of solutions.
In an embodiment, the way each camera position's media information stream is received is not limited: different camera positions' streams may be received in the same way, or the receiving mode may be selected according to the specific setup. For example, the media information streams of the selected camera positions in a scene may be pulled via the Real Time Messaging Protocol (RTMP). The embodiment only requires that a plurality of media information streams can be received; the specific transport is not limited here, so the application also applies to scenarios in which the media information streams are pulled in other ways.
In an embodiment, the timing and number of the media information streams, and of the media information packets within each stream, need not be limited and can be set for the specific scene. For example, a venue typically has more than 50 camera positions, corresponding to more than 50 media information streams to receive; since users enter the venue for viewing at a particular time, the transmission or playback time of the selected media information streams can be set near that time so that users can watch the video then.
Step S120: acquiring a first display time stamp of the received target media information packet, wherein the target media information packet is the first received media information packet among all media information packets.
In this step, the inconsistency with which the pictures of the individual media information streams for the same moment arrive at the media server must be remedied: regardless of the order in which they arrive, all media information streams need to be synchronized. To avoid missing or mismatched media information streams, at least the first received media information packet must be identified as the starting point. The first received packet among all media information packets is therefore found and taken as the target media information packet, and its first display timestamp is obtained, so that in subsequent steps the display timestamps of all media information packets can be aligned to it.
In one embodiment, the first display timestamp of the received target media information packet may be obtained in various ways, which are not limited here. For example, the display timestamps of all media information packets can be collected and compared to obtain the first display timestamp of the target media information packet.
Step S130: the first display time stamp is used as a starting display time stamp of each media information stream.
In this step, the first display timestamp is taken as the starting display timestamp of every media information stream, so that the display timestamps of all media information streams are synchronized to the same starting point. This remedies the inconsistent arrival of the individual streams at the media server for the same moment, and allows each stream to be sliced and aggregated according to the starting display timestamp in subsequent steps.
A specific example is given below to illustrate the working principle and flow of the above embodiments.
Example one:
fig. 2a shows a schematic diagram of a plurality of media information streams provided by an embodiment of the present application before alignment, and fig. 2b shows the same streams after alignment. As an example, the media information streams of 3 camera positions are shown, and the stream of each camera position consists of a plurality of slices.
Taking the media server as the execution subject: once media information packets are received into the stream-receiving buffer queue, each media information packet in the queue is traversed and checked for being the first received media information packet. If it is, the start point of the first slice of every camera position is forcibly set to the first display timestamp (Presentation Time Stamp, PTS) of the current slice, i.e., the PTS of this first received media information packet. Otherwise, the packet is stored into the linked list of its camera position, and the check is repeated for the next packet until the required first media information packet is found.
As shown in fig. 2a, the media information stream of each camera position is shown without modifying the starting display timestamp; the numbers in the boxes represent the PTS of the current media information packet. The slice duration is 6 s: the first slice of camera 1 covers the PTS range [0, 540000) with start PTS 0, the first slice of camera 2 covers [7200, 547200) with start PTS 7200, and the first slice of camera 3 covers [3600, 543600) with start PTS 3600. Because the PTS ranges of the initial slices of the camera positions are inconsistent, the terminal suffers large spatial jumps of the picture when switching between camera positions.
As shown in fig. 2b, the media information stream of each camera position is shown with the starting display timestamp modified. Taking the case where the first received packet belongs to camera 2 (i.e., the first media information packet in the receive buffer queue comes from camera 2) and the slice duration is 6 s: compared with the original streams, the first slice of camera 1 now covers the PTS range [0, 547200) with start PTS 7200, the first slice of camera 2 covers [7200, 547200) with start PTS 7200, and the first slice of camera 3 covers [3600, 547200) with start PTS 7200, so the start PTS of the second slice of every camera position is the same. Since the start PTS and the slice duration of each camera position's slices are now identical, slices with the same sequence number cover the same time period even though the media information streams reach the media server at different times.
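The PTS arithmetic of Example One can be checked with a short calculation. The sketch below assumes a 90 kHz timebase, under which a 6 s slice spans 6 * 90000 = 540000 PTS ticks; this timebase is an assumption consistent with the numbers in figs. 2a and 2b, not stated explicitly in the application.

```python
# Reproduce the PTS arithmetic of Example One under an assumed 90 kHz timebase.
TIMEBASE = 90000
SLICE_TICKS = 6 * TIMEBASE   # 6 s slice = 540000 PTS ticks

# Unaligned case (fig. 2a): each camera's first slice starts at its own PTS.
machine_first_pts = {1: 0, 2: 7200, 3: 3600}
unaligned_ranges = {m: (pts, pts + SLICE_TICKS)
                    for m, pts in machine_first_pts.items()}

# Aligned case (fig. 2b): the first received packet (camera 2, PTS 7200)
# forces a common start PTS; every camera's first slice ends at the same tick.
forced_start = 7200
aligned_first_slice_end = forced_start + SLICE_TICKS   # 547200
second_slice_start = aligned_first_slice_end           # identical for all cameras
```

The computed unaligned ranges [0, 540000), [7200, 547200), and [3600, 543600) match fig. 2a, and the common second-slice start of 547200 matches the aligned layout of fig. 2b.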
Step S140: performing information slicing on each media information stream according to the initial display time stamp to obtain a plurality of media slicing information of each media information stream, wherein the media slicing information corresponds to slicing serial numbers, and all the media slicing information with the same slicing serial number have the same media duration;
In this step, since the starting display timestamp of each media information stream was determined in step S130, each media information stream can be sliced according to that starting display timestamp to obtain a plurality of pieces of media slice information per stream, distinguished by slice sequence numbers. All pieces of media slice information with the same slice sequence number have the same media duration, so for different media information streams, comparing slice sequence numbers confirms that the pieces of media slice information cover the same time period; in subsequent steps, the pieces for the same time period are aggregated into one complete free-viewpoint slice.
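The aggregation by slice sequence number described above might be sketched as follows; the `(stream_id, slice_seq)` pair representation is an illustrative assumption, not the application's data model.

```python
from collections import defaultdict


def aggregate_free_viewpoint(slices):
    """Group media slices by slice sequence number across streams; each group
    of same-numbered slices forms one free-viewpoint slice. `slices` is a
    list of (stream_id, slice_seq) pairs (illustrative shape)."""
    groups = defaultdict(list)
    for stream_id, slice_seq in slices:
        groups[slice_seq].append(stream_id)
    return dict(groups)
```

For three camera streams each producing slices 0 and 1, the result maps sequence number 0 to all streams that contributed a slice 0, and likewise for 1.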
As shown in fig. 3, step S140 is further described, and step S140 includes, but is not limited to, steps S141 and S142.
Step S141: for each media information stream, acquiring a second display time stamp of the currently received media information packet;
step S142: when the information-slicing condition is determined to be met from the second display time stamp and the starting display time stamp, performing initial information slicing according to the currently received media information packet, taking the second display time stamp as a new starting display time stamp, and performing subsequent information slicing according to the new starting display time stamp.
In this step, the second display timestamp of the currently received media information packet is obtained and compared with the aligned starting display timestamp to decide whether the information-slicing condition is met. If it is, initial information slicing is performed on the basis of the currently received packet, and the qualifying second display timestamp becomes the new starting display timestamp for subsequent slicing, so that complete media slice information is obtained for the currently received packets and, in later steps, the pieces of media slice information for the same time period can be aggregated into one complete free-viewpoint slice.
In an embodiment, the information-slicing condition can be set according to the specific scenario and is not limited here. For example, the condition may include, but is not limited to: the ratio of the difference between the second display timestamp and the starting display timestamp to a preset time reference is greater than or equal to a preset slice duration. The preset time reference may be, but is not limited to, the timebase of the corresponding media information stream; when all media information packets have the same duration, that duration may be, but is not limited to, taken as the preset slice duration. The difference between the two display timestamps measures how far the second display timestamp has advanced beyond the starting one, i.e., whether it is large enough to trigger the next slice. When the ratio of the difference to the preset time reference is smaller than the preset slice duration, it can be determined that the currently received media information packet does not need to open a new slice.
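The information-slicing condition above can be expressed directly in code; the default timebase and slice duration in this sketch are illustrative assumptions, not values fixed by the application.

```python
def meets_slicing_condition(second_pts, start_pts, timebase=90000, slice_seconds=6):
    """True when (second_pts - start_pts) / timebase reaches the preset slice
    duration, i.e. the current packet should open a new slice. The 90 kHz
    timebase and 6 s duration are assumed defaults for illustration."""
    return (second_pts - start_pts) / timebase >= slice_seconds
```

With a start PTS of 7200, a packet at PTS 547200 has advanced exactly 6 s and triggers a new slice, while a packet at PTS 540000 (about 5.92 s later) does not.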
In one embodiment, the second display timestamp of the currently received media information packet may be obtained in various ways, which are not limited here. For example, the display timestamps of all media information packets can be collected and compared to obtain the second display timestamp of the currently received media information packet.
In an embodiment, after the subsequent information slicing is performed according to the new starting display timestamp, slicing can continue in the manner of step S142: when the duration of each slice is known, the next starting display timestamp can be determined from the slice duration, the previous starting display timestamp, and the preset time reference, and further slicing can proceed from that next starting display timestamp.
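Deriving the next starting display timestamp from the previous one, as described above, might look like the following; again the timebase and slice duration are assumed values for illustration.

```python
def next_start_pts(last_start_pts, slice_seconds=6, timebase=90000):
    """Next slice's starting display timestamp = previous starting timestamp
    plus the slice duration expressed in timebase ticks (values assumed)."""
    return last_start_pts + slice_seconds * timebase
```

Starting from 7200, successive slice boundaries fall at 547200, 1087200, and so on.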
As shown in fig. 4, an embodiment of the present application further describes steps performed before steps S141 to S142, including but not limited to steps S160 to S180.
Step S160: detecting whether a first target media information stream exists, wherein the first target media information stream is a media information stream meeting a cut-off recovery condition;
Step S170: when it is detected that the first target media information stream exists, acquiring the difference values between the second display time stamp corresponding to the first target media information stream and the initial display time stamps corresponding to a plurality of second target media information streams, wherein a second target media information stream is a media information stream that does not meet the cut-off recovery condition;
Step S180: updating the initial display time stamp and the slice sequence number of the first target media information stream to the initial display time stamp and the slice sequence number of the second target media information stream corresponding to a target difference value, wherein the target difference value is the smallest of all the difference values.
In these steps, since cut-off recovery affects the subsequent information slicing performed on the media information packets, step S160 detects whether there is a first target media information stream satisfying the cut-off recovery condition. When a first target media information stream is detected, the difference values between the second display time stamp corresponding to the first target media information stream and the initial display time stamps corresponding to the second target media information streams are acquired; that is, the display time stamp differences between the first target media information stream (which satisfies the cut-off recovery condition) and all second target media information streams (which do not) are considered. The initial display time stamp and slice sequence number of the second target media information stream corresponding to the target difference value are then selected as the basis for updating the initial display time stamp and slice sequence number of the first target media information stream. Because the target difference value is the smallest of all the difference values, the first target media information stream is updated to the initial display time stamp and slice sequence number of its nearest-neighbor second target media information stream, so that the deviation introduced by the cut-off into subsequent information slicing is reduced as much as possible.
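The nearest-neighbor selection of step S180 can be sketched as follows; the names and data layout are illustrative assumptions, and the example numbers reuse those of fig. 8 in Example three:

```python
def pick_recovery_source(recovered_pts: int, normal_streams: list) -> tuple:
    """normal_streams holds (start_pts, seg_no) pairs for the second
    target media information streams; return the pair whose initial
    display time stamp is nearest below the recovered stream's PTS."""
    return min(normal_streams, key=lambda s: recovered_pts - s[0])

# Machine position 2: startpts 540000, segno 2; machine position 3:
# startpts 1080000, segno 3. A stream recovering at PTS 1083600 adopts
# machine position 3's startpts and segno (smallest difference, 3600).
print(pick_recovery_source(1083600, [(540000, 2), (1080000, 3)]))  # (1080000, 3)
```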
In an embodiment, the cut-off recovery condition may be set according to the specific scenario, which is not limited herein. For example, the cut-off recovery condition may include, but is not limited to: the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to a preset time reference being greater than a preset timeout period. The preset time reference may be, but is not limited to, the time base of the corresponding media information stream. Comparing the second display time stamp of the currently received media information packet with the display time stamp of the last received media information packet measures the gap between the two, so that the actual timeout degree of the second display time stamp can be better determined. It can be understood that when the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to the preset time reference is less than or equal to the preset timeout period, it can be determined that cut-off recovery does not need to be performed for the currently received media information packet.
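A sketch of this cut-off recovery check; the symbol names follow Example three (lastpts, timebase, overtime), while the function name and example values are assumptions:

```python
def cut_off_recovered(cur_pts: int, last_pts: int,
                      timebase: int, overtime: float) -> bool:
    """True when the gap between the current and previous packet's
    display time stamps exceeds the preset timeout period (seconds)."""
    return (cur_pts - last_pts) / timebase > overtime

# 90 kHz time base, 5-second timeout: a 6-second gap counts as a cut-off,
# a 1-second gap does not.
print(cut_off_recovered(540000, 0, 90000, 5))  # True
print(cut_off_recovered(90000, 0, 90000, 5))   # False
```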
Step S150: aggregating the target media slice information in all the media information streams to obtain free viewpoint media slice information, wherein the target media slice information is the media slice information with the same slice sequence number.
In this step, the first display time stamp of the obtained target media information packet is uniformly set as the initial display time stamp of each media information stream, which overcomes the defect that the pictures of the media information streams arriving at the media server at the same moment are inconsistent. Information slicing is then performed on each media information stream according to the initial display time stamp to obtain a plurality of pieces of media slice information, and the media slice information with the same slice sequence number in all media information streams is aggregated to obtain complete free viewpoint media slice information. This avoids large-scale spatial jumps of the video picture when the user switches viewing angles while preserving picture quality; the embodiment of the present application therefore enables the user to switch seamlessly between free viewpoints, improves the user's video experience, and fills a technical gap in related methods.
In an embodiment, the target media slice information may be, but is not limited to, the media slice information whose slice sequence number is not 1. Referring to the examples of fig. 2a and fig. 2b, the display time stamp of the first piece of media slice information in the media information stream of each machine position is modified to the first display time stamp of the first received media information packet, so the durations of the first pieces of media slice information of the machine positions (i.e., those with slice sequence number 1) do not correspond to one another. If the media slice information with slice sequence number 1 were aggregated directly, it would not align with the media slice information with slice sequence number 2; therefore, aggregation may start from the media slice information with slice sequence number 2, so as to obtain reliable and stable free viewpoint media slice information.
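The aggregation rule above (same slice sequence number across all streams, skipping sequence number 1) can be sketched as follows; the data layout is an assumption for illustration, not the patent's own structure:

```python
def aggregate_slices(streams: dict) -> dict:
    """streams maps machine id -> {seg_no: slice payload}. Aggregate the
    slices sharing a sequence number across all machines, starting from
    seg_no 2 because the forcibly aligned first slices differ in duration."""
    common = set.intersection(*(set(s) for s in streams.values())) - {1}
    return {n: [streams[m][n] for m in sorted(streams)] for n in sorted(common)}

streams = {1: {1: "m1s1", 2: "m1s2"}, 2: {1: "m2s1", 2: "m2s2"}}
print(aggregate_slices(streams))  # {2: ['m1s2', 'm2s2']}
```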
In an embodiment, unlike the prior art, the video stream of each machine position does not need to be compressed and spliced into a single ultra-high-resolution image followed by image correction. Instead, each media information stream is sliced according to its display time stamps and aggregated based on the target media slice information, yielding the final free viewpoint media slice information. This greatly reduces the requirement on network bandwidth and makes the method applicable to more users. Moreover, the slice-splicing approach of the embodiment of the present application does not need to consider the actual resolution of each machine position, that is, the resolution of each machine position does not need to be lowered to match the resolution played by the user, which further improves the user experience.
As shown in fig. 5, step S150 is further described, and step S150 includes, but is not limited to, steps S151 to S153.
Step S151: traversing the target media slice information in each media information stream in sequence;
Step S152: judging whether the current target media slice information is the first media slice information after cut-off recovery;
Step S153: if the current target media slice information is not the first media slice information after cut-off recovery, aggregating the current target media slice information.
In these steps, the target media slice information in each media information stream is traversed to judge whether the current target media slice information is the first media slice information after cut-off recovery. As explained in the embodiments above, the first media slice information after cut-off recovery, like the media slice information whose first display time stamp has been modified for each machine position, is not well suited to aggregation. Therefore, the current target media slice information is aggregated only when it is judged not to be the first media slice information after cut-off recovery, so as to obtain reliable and stable free viewpoint media slice information. That is, in the cut-off recovery case, aggregation resumes only from the second media slice information after recovery, which yields better free viewpoint media slice information.
In one embodiment of the present application, step S150 is further described on the basis of steps S151 to S153, and step S150 further includes, but is not limited to, step S154.
Step S154: if the current target media slice information is the first media slice information after cut-off recovery, not aggregating the current target media slice information.
In this step, since the first media slice information after cut-off recovery, like the media slice information whose first display time stamp has been modified for each machine position, is not well suited to aggregation, the current target media slice information is not aggregated when it is judged to be the first media slice information after cut-off recovery. The overall aggregation process of the free viewpoint media slice information is therefore unaffected; that is, in the cut-off recovery case, aggregation resumes only from the second media slice information after recovery, which yields better free viewpoint media slice information.
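Steps S151 to S154 amount to a filter over the candidate slices; a minimal sketch follows, in which the per-slice flag is an assumed bookkeeping field rather than something the patent names:

```python
def slices_eligible_for_aggregation(slices: list) -> list:
    """Keep only slices that are not the first slice emitted after a
    cut-off recovery; those are skipped per steps S153/S154."""
    return [s for s in slices if not s.get("first_after_recovery", False)]

pending = [
    {"machine": 1, "seg_no": 4, "first_after_recovery": True},  # just recovered
    {"machine": 2, "seg_no": 4},
    {"machine": 3, "seg_no": 4},
]
print([s["machine"] for s in slices_eligible_for_aggregation(pending)])  # [2, 3]
```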
Various specific examples are given below to illustrate the working principles and flow of the above embodiments.
Example two:
Fig. 6 is a schematic diagram of a media server for performing a media information processing method according to an embodiment of the present application.
Referring to fig. 6, the media server may include, but is not limited to, a stream-receiving module, an alignment module, and a splicing module, wherein:
the stream-receiving module is configured to pull the media stream of each machine position at the free viewpoint front end (namely the machine position 1 media stream, machine position 2 media stream, machine position 3 media stream, ..., machine position n media stream shown in fig. 6) and add it to a stream-receiving buffer queue;
the alignment module is configured to take the media streams out of the stream-receiving buffer queue, perform alignment processing on them, and then slice them;
and the splicing module is configured to aggregate the slices of all machine positions with the same slice sequence number into a complete free viewpoint slice.
According to the above example, through the cooperation of the stream-receiving module, the alignment module, and the splicing module, the user can switch seamlessly between free viewpoints and the user's video experience is improved, which fills a technical gap in related methods.
Example three:
The working principle and flow of the alignment module in Example two are described in detail below.
Fig. 7 is a flowchart of a media information processing method performed by the alignment module according to an embodiment of the present application.
Referring to fig. 7, the alignment module may, but is not limited to, perform the following steps:
Step a: traversing each media information packet in the stream-receiving buffer queue and judging whether the current media information packet is the first received media information packet; if so, forcibly setting the startpts of the first slice of all machine positions to the PTS of the first received media information packet (namely the current media information packet) and then entering step b; otherwise, entering step b directly without any processing.
Step b: storing each media information packet into the linked list corresponding to its machine position.
Step c: judging whether the machine position has a cut-off recovery scenario according to the formula (curpts - lastpts) / timebase > overtime; if so, entering step d, otherwise entering step e, wherein curpts represents the PTS of the current media information packet of the machine position, lastpts represents the PTS of the last media packet of the machine position, timebase represents the time base of the media stream, and overtime represents the preset timeout period.
Step d: calculating the difference diffpts between curpts and the startpts of each other normal machine position, finding the startpts and segno (segno denotes the slice sequence number, incremented from 1) of the machine position corresponding to the minimum diffpts, and setting them as the corresponding information of the cut-off recovered machine position. Specifically, fig. 8 is a schematic diagram of a plurality of media information streams provided by another embodiment of the present application, in which the numbers in the boxes represent the PTS values of the media information packets. Suppose the startpts of machine position 2 is 540000 with segno 2, and the startpts of machine position 3 is 1080000 with segno 3. When machine position 1 recovers from a cut-off with a current packet whose PTS is 1083600, the PTS differences against the normal machine positions are calculated: 1083600 - 540000 = 543600 for machine position 2 and 1083600 - 1080000 = 3600 for machine position 3. The difference for machine position 3 is the minimum, so the startpts and segno of machine position 3 (1080000 and 3) are set as the corresponding information of machine position 1, which therefore aligns with machine position 3 after cut-off recovery. When the next media information packet of machine position 2 arrives with PTS 1080000, machine position 2 switches to its next slice and likewise aligns with machine positions 1 and 3.
Step e: judging whether the machine position meets the slicing condition according to the formula (curpts - startpts) / timebase >= min_seg_duration; if so, slicing directly, naming the slice with segno, incrementing segno by 1, and entering step a; otherwise, entering step a directly without any processing, wherein min_seg_duration represents the preset slicing duration.
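Steps a to e above can be combined into a single per-packet routine. This is a hedged sketch: the state container and control flow are assumptions, while the formulas and symbol names (startpts, lastpts, segno, timebase, overtime, min_seg_duration) are those of this example:

```python
class MachineState:
    """Per-machine-position slicing state, as implied by steps a-e."""
    def __init__(self):
        self.startpts = None  # initial display time stamp of current slice
        self.lastpts = None   # PTS of the last received packet
        self.segno = 1        # slice sequence number, incremented from 1

def on_packet(machine: int, curpts: int, states: dict,
              timebase: int, overtime: float, min_seg_duration: float) -> bool:
    """Process one media information packet; return True when a slice closes."""
    st = states[machine]
    # Step a: on the very first packet, force every machine position's
    # first slice to start at this packet's PTS.
    if all(s.startpts is None for s in states.values()):
        for s in states.values():
            s.startpts = curpts
    # Steps c-d: on cut-off recovery, adopt the startpts and segno of
    # the nearest normal machine position.
    if st.lastpts is not None and (curpts - st.lastpts) / timebase > overtime:
        src = min((s for m, s in states.items() if m != machine),
                  key=lambda s: curpts - s.startpts)
        st.startpts, st.segno = src.startpts, src.segno
    # Step e: close the slice once it reaches the preset duration.
    sliced = (curpts - st.startpts) / timebase >= min_seg_duration
    if sliced:
        st.startpts = curpts  # next slice starts at the current packet
        st.segno += 1
    st.lastpts = curpts
    return sliced

states = {1: MachineState(), 2: MachineState()}
print(on_packet(1, 0, states, 90000, 5, 2))       # False (first packet)
print(on_packet(1, 180000, states, 90000, 5, 2))  # True (2 s elapsed)
```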
Example four:
The working principle and flow of the splicing module in Example two are described in detail below.
Fig. 9 is a flowchart of a method for performing media information processing by a splicing module according to an embodiment of the present application.
Referring to fig. 9, the splicing module may, but is not limited to, perform the following steps:
Step a: scanning the slice information and judging whether the slice sequence number n to be aggregated is 1; if not, entering step b; if so, incrementing the slice sequence number by 1 and performing step a again. Because the durations of the first slices of all machine positions are inconsistent after the forced alignment, no aggregation operation is performed on the first slice of any machine position.
Step b: traversing the slices with the same sequence number of all machine positions in turn, namely the slices with sequence number n, and judging whether the current slice is the first slice after cut-off recovery; if so, performing step b again, otherwise entering step c.
Step c: aggregating the slice of the machine position into the free viewpoint media slice information, and judging whether all slices with sequence number n have been scanned; if so, incrementing the slice sequence number by 1 and entering step a; otherwise, entering step b.
As can be seen from the above examples, the embodiment of the present application forcibly sets the initial PTS of the first slice of all machine positions in the initialization stage; in the running stage it slices according to the slicing duration and increments the slice sequence number; when a cut-off recovery scenario is detected at a machine position, the initial PTS and slice sequence number of that machine position's current slice are recalculated; and the slice information of all machine positions in the same time period is then aggregated by slice sequence number into complete free viewpoint slices from which the user can select a viewing angle to play. This solves the problem that the pictures of the machine position code streams arriving at the media server at the same moment are inconsistent, avoids large-scale spatial jumps of the video picture when the user switches viewing angles, preserves picture quality, and reduces the bandwidth and performance requirements on the terminal device, so that the user can switch seamlessly between free viewpoints and the user's video experience is improved.
The method of the embodiment of the present application can be widely applied to scenarios such as panoramic video generation under VR and virtual viewpoint scenes.
In addition, as shown in fig. 10, an embodiment of the present application also discloses a media information processing device 100, including: at least one processor 110; and at least one memory 120 for storing at least one program; when the at least one program is executed by the at least one processor 110, the media information processing method of any of the previous embodiments is implemented.
In addition, an embodiment of the present application also discloses a computer-readable storage medium in which computer-executable instructions for performing the media information processing method of any of the previous embodiments are stored.
Furthermore, an embodiment of the present application also discloses a computer program product including a computer program or computer instructions stored in a computer-readable storage medium, the computer program or computer instructions being read from the computer-readable storage medium by a processor of a computer device, the processor executing the computer program or computer instructions to cause the computer device to perform the media information processing method as in any of the previous embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (10)

1. A media information processing method, comprising:
receiving a plurality of media information streams, wherein the media information streams comprise a plurality of media information packets;
acquiring a first display time stamp of a received target media information packet, wherein the target media information packet is a first received media information packet in all the media information packets;
taking the first display time stamp as a starting display time stamp of each media information stream;
performing information slicing on each media information stream according to the initial display time stamp to obtain a plurality of pieces of media slicing information of each media information stream, wherein each piece of media slicing information corresponds to a slicing sequence number, and all pieces of media slicing information with the same slicing sequence number have the same media duration;
and aggregating the target media slicing information in all the media information streams to obtain free viewpoint media slicing information, wherein the target media slicing information is the media slicing information with the same slicing sequence number.
2. The media information processing method according to claim 1, wherein performing the information slicing on each of said media information streams according to said start display time stamp comprises:
for each media information stream, acquiring a second display time stamp of the currently received media information packet; when it is determined, according to the second display time stamp and the initial display time stamp, that the information slicing condition is met, performing initial information slicing according to the currently received media information packet, taking the second display time stamp as a new initial display time stamp, and performing subsequent information slicing according to the new initial display time stamp.
3. The media information processing method of claim 2, wherein the information slicing condition comprises:
and the ratio of the difference between the second display time stamp and the initial display time stamp to a preset time reference is greater than or equal to a preset slicing duration.
4. The method according to claim 2, wherein before the information slicing is performed on each of the media information streams according to the start display time stamp, the method further comprises:
detecting whether a first target media information stream exists, wherein the first target media information stream is the media information stream meeting the cut-off recovery condition;
when the first target media information stream is detected to exist, acquiring a difference value between the second display time stamp corresponding to the first target media information stream and the initial display time stamp corresponding to a plurality of second target media information streams, wherein the second target media information stream is the media information stream which does not meet the cut-off recovery condition;
updating the initial display time stamp and the fragment sequence number of the first target media information stream to the initial display time stamp and the fragment sequence number of the second target media information stream corresponding to a target difference value, wherein the target difference value is the smallest of all the difference values.
5. The media information processing method of claim 4, wherein the break recovery condition comprises:
and the ratio of the difference between the second display time stamp and the display time stamp of the last received media information packet to a preset time reference is greater than a preset timeout period.
6. The media information processing method of claim 1, wherein the target media slice information is the media slice information with the slice sequence number not 1.
7. The media information processing method according to claim 6, wherein aggregating the target media slicing information in all the media information streams comprises:
traversing the target media fragment information in each media information stream in sequence;
judging whether the current target media slicing information is the first media slicing information after the interruption recovery;
And if the current target media slicing information is not the first media slicing information after the interruption recovery, aggregating the current target media slicing information.
8. The media information processing method according to claim 7, wherein aggregating the target media slicing information in all the media information streams further comprises:
and if the current target media slicing information is the first media slicing information after the interruption recovery, not aggregating the current target media slicing information.
9. A media information processing device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the media information processing method according to any one of claims 1 to 8 when executing the computer program.
10. A computer-readable storage medium storing computer-executable instructions for performing the media information processing method of any one of claims 1 to 8.
CN202210642307.6A 2022-06-08 2022-06-08 Media information processing method and device and storage medium Pending CN117241105A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210642307.6A CN117241105A (en) 2022-06-08 2022-06-08 Media information processing method and device and storage medium
PCT/CN2023/089286 WO2023236666A1 (en) 2022-06-08 2023-04-19 Media information processing method and apparatus, and storage medium


Publications (1)

Publication Number Publication Date
CN117241105A 2023-12-15


Also Published As

Publication number Publication date
WO2023236666A1 (en) 2023-12-14

