CN117812289A - Audio and video transcoding method and device and electronic equipment

Audio and video transcoding method and device and electronic equipment

Info

Publication number
CN117812289A
Authority
CN
China
Prior art keywords
video
audio
code rate
category information
videos
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311865763.8A
Other languages
Chinese (zh)
Inventor
郭兆亮
雷威
王跃华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202311865763.8A
Publication of CN117812289A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/234381 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • H04N21/23424 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/440281 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses an audio and video transcoding method and device and an electronic device. Playing order information and category information can be obtained for each audio/video in a plurality of audios and videos; a code rate corresponding to the category information of each audio/video is determined; the plurality of audios and videos are transcoded respectively according to the determined code rates; and the plurality of audios and videos are combined into one audio/video according to the playing order information. Because the determined code rate corresponds to the category information of the audio/video, the code rates can differ when the category information differs, so the single audio/video obtained by combining the plurality of audios and videos comprises multiple audio/video segments with different code rates. The method and device avoid the high storage-space and network-bandwidth consumption caused by a high fixed code rate, and also avoid the low audio/video quality caused by a low fixed code rate. The present application can thus strike a balance between data volume and quality.

Description

Audio and video transcoding method and device and electronic equipment
Technical Field
The present application relates to the field of internet technologies, and in particular, to an audio and video transcoding method, an audio and video transcoding device, and an electronic device.
Background
Transcoding is a technique that converts audio and video from one coding format into another coding format.
Existing transcoding technology controls the transcoding process based on a preset fixed code rate, so the code rate of the audio and video obtained after transcoding is that preset fixed code rate. For example, when video is transcoded with existing technology, every video frame in the same output video has the same code rate.
If the fixed code rate is high, the data volume of the transcoded audio and video increases greatly and occupies more storage space and network bandwidth. If the fixed code rate is low, the quality of the audio and video decreases, for example: video definition and audio quality are reduced.
How to balance the data volume and the quality is still a technical problem to be solved in the field.
Disclosure of Invention
In view of the above, the present application provides an audio/video transcoding method, device and electronic equipment, which are used for solving the problem that data volume and quality are difficult to balance.
In order to achieve the above object, the following solutions have been proposed:
an audio-video transcoding method, the audio-video transcoding method comprising:
The method comprises the steps of obtaining playing order information of each audio and video in a plurality of audio and video and category information of each audio and video, wherein the category information is used for indicating categories of sound and/or images output by the audio and video when being played, and the playing order information is used for indicating the playing order of the audio and video in the plurality of audio and video;
determining a code rate corresponding to the category information of the audio and video;
transcoding the plurality of audios and videos respectively according to the determined code rate;
and combining the plurality of audios and videos into one audio and video according to the playing order information.
Optionally, the method further comprises:
obtaining a preset importance level of the audio and video according to a preset corresponding relation, wherein the preset importance level and the audio and video have the preset corresponding relation, and the preset importance level is used for indicating the importance level of the audio and video;
the determining the code rate corresponding to the category information of the audio and video comprises the following steps:
and determining code rates corresponding to the category information and the preset importance level of the audio and video.
Optionally, the determining the code rate corresponding to the category information of the audio and video and the preset importance level includes:
and filling the determined category information and the preset importance level into a preset code rate acquisition function as parameters, and operating the preset code rate acquisition function to obtain the code rate corresponding to the category information and the preset importance level of the audio and video.
Optionally, the audio and video are video, and the parameters of the preset code rate acquisition function further include: at least one of a type of terminal device, a resolution of video, and a coding type of video.
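A minimal sketch of such a preset code rate acquisition function might look as follows. Every table value, category name, and scaling factor below is an illustrative assumption, not taken from the application; the claim only specifies that category information, importance level, and optionally video parameters such as resolution are inputs.

```python
# Hypothetical sketch of a "preset code rate acquisition function":
# category information and preset importance level (plus an optional
# resolution) are filled in as parameters, and a code rate in Mbps
# is returned. All table values are illustrative assumptions.

BASE_RATE_MBPS = {            # assumed code rate per category
    "feature": 4.0,
    "advertisement": 3.0,
    "opening_credits": 1.0,
    "closing_credits": 1.0,
}

IMPORTANCE_FACTOR = {1: 0.75, 2: 1.0, 3: 1.25}   # assumed scaling per level
RESOLUTION_FACTOR = {"720p": 1.0, "1080p": 2.0, "4k": 6.0}  # assumed

def get_code_rate(category: str, importance: int,
                  resolution: str = "1080p") -> float:
    """Return a code rate (Mbps) for the given category and importance."""
    base = BASE_RATE_MBPS.get(category, 2.0)      # fallback for unknown categories
    return base * IMPORTANCE_FACTOR[importance] * RESOLUTION_FACTOR[resolution]
```

With this sketch, a feature-film segment at importance level 2 in 1080p would receive a higher code rate than opening credits at level 1 in 720p, mirroring the category-dependent rates the claim describes.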
Optionally, each audio and video is a video frame group;
the play order information includes: the range of the frame number of the video frame group in the combined audio and video and/or the range of the display time stamp of the video frame group in the combined audio and video.
Optionally, before the obtaining the playing order information of each audio and video in the plurality of audio and video and the category information of each audio and video, the method further includes:
dividing the audio and video with the number of the contained video frames exceeding the preset number into at least two audio and video, wherein each audio and video obtained after division is a video frame group, and the category information of each audio and video obtained after division is the same as the category information of the audio and video with the number of the contained video frames exceeding the preset number.
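The division step described above can be sketched as follows. This is a simplified model in which a segment is just a list of frames; the threshold value and names are hypothetical, but the behavior matches the claim: a segment whose frame count exceeds the preset number is split into video frame groups, each inheriting the original category information.

```python
def split_into_groups(frames, max_frames, category):
    """Split a segment whose frame count exceeds `max_frames` into
    video frame groups of at most `max_frames` frames each; every
    resulting group keeps the original category information."""
    if len(frames) <= max_frames:
        return [(category, frames)]
    return [(category, frames[i:i + max_frames])
            for i in range(0, len(frames), max_frames)]
```

For example, a 25-frame segment with a preset number of 10 yields three groups of 10, 10, and 5 frames, all carrying the same category information.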
Optionally, the plurality of audios and videos are obtained by dividing at least one audio and video, and category information of the divided audios and videos with adjacent relation in the at least one audio and video is different.
Optionally, the method further comprises:
transmitting the combined audio and video code stream to a terminal device;
obtaining feedback information of the user of the terminal equipment on the combined audio and video;
adjusting the preset importance level of at least one of the plurality of audios and videos according to the feedback information;
and returning to the step of determining the code rate corresponding to the category information of the audio and video and the preset importance level.
An audio-video transcoding device, the audio-video transcoding device comprising:
the information obtaining unit is used for obtaining the playing order information of each audio and video in the plurality of audio and video and the category information of each audio and video, wherein the category information is used for indicating the category of sound and/or image output by the audio and video when playing, and the playing order information is used for indicating the playing order of the audio and video in the plurality of audio and video;
the code rate determining unit is used for determining the code rate corresponding to the category information of the audio and video;
the transcoding unit is used for transcoding the plurality of audios and videos respectively according to the determined code rate;
and the merging unit is used for merging the plurality of audios and videos into one audio and video according to the playing order information.
An electronic device comprising at least one processor, at least one memory, and a bus connected to the processor; the processor and the memory communicate with each other through the bus; and the processor is configured to invoke program instructions in the memory to perform any of the audio/video transcoding methods described above.
The application provides an audio and video transcoding method and device and an electronic device. Playing order information and category information can be obtained for each audio/video in a plurality of audios and videos; a code rate corresponding to the category information of each audio/video is determined; the plurality of audios and videos are transcoded respectively according to the determined code rates; and the plurality of audios and videos are combined into one audio/video according to the playing order information. Because the determined code rate corresponds to the category information of the audio/video, the code rates can differ when the category information differs, so the single audio/video obtained by combining the plurality of audios and videos comprises multiple audio/video segments with different code rates. For categories of audio and video whose playing quality must be guaranteed (such as advertisements and the feature film), the code rate can be higher; correspondingly, for categories whose playing-quality requirements are not high (such as the opening and closing credits), the code rate can be lower. This avoids the high storage-space and network-bandwidth consumption caused by a high fixed code rate, and also avoids the low audio/video quality caused by a low fixed code rate. It can be seen that the present application can strike a balance between data volume and quality.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required by the embodiments or the description of the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of an audio/video transcoding method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of code rate comparison provided in an embodiment of the present application;
Fig. 3 is a flowchart of another audio/video transcoding method according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a code rate corresponding to category information and importance level of an audio and video according to an embodiment of the present application;
Fig. 5 is a schematic diagram of audio/video segmentation according to an embodiment of the present application;
Fig. 6 is a schematic diagram of further audio/video segmentation according to an embodiment of the present application;
Fig. 7 is a flowchart of another audio/video transcoding method according to an embodiment of the present application;
Fig. 8 is a flowchart of another audio/video transcoding method according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an audio/video transcoding device according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Fig. 1 is a flow chart of an audio/video transcoding method according to an embodiment of the present application. As shown in fig. 1, an embodiment of the present application provides an audio/video transcoding method, which may include:
s10, obtaining playing order information of each audio and video in the plurality of audio and video and category information of each audio and video, wherein the category information is used for indicating the categories of sound and/or images output by the audio and video during playing, and the playing order information is used for indicating the playing order of the audio and video in the plurality of audio and video.
The audio/video may include audio and/or video, and a video may be with or without sound; the sound of a video may be referred to as its audio track. The audio/video transcoding method provided by the present application is applicable whether the audio/video is pure audio, video with an audio track, or video without an audio track.
If the audio/video is video, it may take a GOP (Group of Pictures) as its basic unit, and one audio/video may include at least one GOP. A GOP is a set of video frames that may include one key frame (I-frame) and some non-key frames (P-frames and B-frames).
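The GOP layout just described can be expressed as a small validity check. This is an illustrative sketch, not part of the patent, and it assumes the common closed-GOP layout in which the single key frame comes first:

```python
def is_valid_gop(frame_types):
    """Check that a list of frame types such as ['I','B','B','P',...]
    forms one GOP: exactly one key frame (I), appearing first, followed
    only by non-key frames (P or B). Closed-GOP assumption; other GOP
    layouts exist in practice."""
    return (len(frame_types) >= 1
            and frame_types[0] == "I"
            and frame_types.count("I") == 1
            and all(t in ("P", "B") for t in frame_types[1:]))
```

A typical short GOP such as `I B B P B B P` passes the check, while a sequence that starts with a P-frame or contains two I-frames does not.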
Optionally, part or all of the plurality of audios and videos may be obtained by dividing at least one audio/video. For example: the plurality of audios and videos comprises five videos in total, three of which are obtained by dividing one movie video; their category information may respectively be the opening credits, the main feature, and the closing credits. The other two videos are advertisements, for different products, and are not obtained by dividing any video.
Relative to the audio/video before division, the audio/video obtained after division may be referred to as an audio/video clip. Optionally, in practical applications, the division may be performed multiple times, for example: one audio/video is first divided into a plurality of audios and videos, and then at least one of the resulting audios and videos is divided again.
If the audio/video is audio, the playing order information is the time range that the audio occupies on the playing time axis of the combined audio/video.
The playing order information of an audio/video may be its playing order within the plurality of audios and videos. Optionally, when each audio/video is a video frame group, the playing order information may include: the range of frame numbers the video frame group occupies in the combined audio/video and/or the range of display timestamps the video frame group occupies in the combined audio/video.
The frame-number range may be the range of frame numbers that a video frame group occupies in the combined audio/video. For example: the audio/video obtained in step S13 contains 30000 video frames in total, and the playing order information of each of the audios and videos combined into it may be its position among those 30000 frames. The position may be represented by frame numbers: the first video frame may be numbered 0, and each subsequent video frame may be numbered by incrementing by 1. The playing order information of the audios and videos combined into the audio/video may then be: 0-999; 1000-9999; ...; 23000-29999.
A PTS (Presentation Time Stamp) is the display time of a video frame on the time axis, measured in basic time units (for example, if the clock frequency of a video is 90 kHz, one basic time unit is 1/90000 second). The display timestamp range represents the time range a video frame group occupies on the playing time axis of the combined audio/video, for example: a video frame group is displayed during seconds 0-30 of the playing time axis of the combined audio/video.
When each audio/video is a video frame group, the combined audio/video is also a sequence of video frame groups. Each video frame in the combined audio/video has a PTS value that specifies at which point on the playing time axis the frame should be displayed, and the frame may be embedded into the code stream according to its PTS value. When the combined audio/video is decoded, the PTS value of each video frame in the code stream is parsed, the display time of each frame on the playing time axis is determined from it, and the frames are played one by one in PTS order. The PTS is important in video editing, transcoding, compression, and playing; its accuracy and continuity affect the quality and smoothness of the video. Therefore, during audio/video processing, the accuracy and continuity of PTS values must be ensured so that the audio/video can display normally.
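The PTS arithmetic described above can be sketched as follows, assuming the 90 kHz clock from the example and a constant frame rate (real containers may use other time bases):

```python
CLOCK_HZ = 90_000  # 90 kHz clock: one basic time unit is 1/90000 second

def frame_pts(frame_index: int, fps: int) -> int:
    """PTS (in basic time units) at which frame `frame_index` is displayed,
    for a constant-frame-rate stream."""
    return frame_index * CLOCK_HZ // fps

def pts_seconds(pts: int) -> float:
    """Convert a PTS value back to seconds on the playing time axis."""
    return pts / CLOCK_HZ
```

At 30 frames per second, frame 30 gets PTS 90000, i.e. exactly 1 second on the playing time axis; consecutive frames get strictly increasing PTS values, which is the continuity property the text requires.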
The category information of an audio/video may be the category of the sound and/or image output when the audio/video is played. Optionally, when the plurality of audios and videos are obtained by dividing at least one audio/video, adjacent divided segments of the same source audio/video have different category information. That is, the division may be performed according to category information: a cut is made wherever the category information changes.
For video without an audio track, the present application may determine the category information from the images output during playing; for video with an audio track, from the sound and/or images output during playing; and for audio, from the sound output during playing.
When classifying according to sound, sounds can be classified by frequency into three categories of low, medium, and high frequency; by loudness into two categories of loud and quiet; by sound property into noise and non-noise; or by sound source into two categories of natural sounds (sounds in nature, such as wind and rain) and artificial sounds (such as music and speech). Of course, the above classification methods are merely examples; in practical applications, the category information may be determined based on any existing classification method for sound. For example: if the sound output during playing is a song, each song may be treated as one category, or a given song may be divided into at least two of an intro, verse, chorus, pre-chorus, interlude, bridge, and outro.
Optionally, the audio or video may comprise a plurality of audio tracks, for example: a video may carry a speech track (human language such as dialogue, lectures, and recitation), a music track (music played by instruments such as piano and guitar), and a sound-effects track (effect sounds such as ambient sounds and action sounds). The present application may determine the category information from at least one track in the audio, or from at least one track carried by the video.
When this embodiment classifies the images output during playing, the images can be classified by the name of the played content into categories such as advertisement, opening credits, closing credits, and the main feature; into two categories of animation and live action according to whether the played image is animated; by the specific objects displayed, into categories such as people, scenery, static objects, and animals; or by information such as the color and brightness of the played images, into two categories of night and daytime.
The embodiment does not limit the category information of the audio and video.
S11, determining the code rate corresponding to the category information of the audio and video.
The code rate refers to the data transmission rate of audio or video data stored in a file, and represents the amount of data stored per second. For audio and video, the code rate represents the amount of audio/video data stored per unit time; a common unit is Mbps (megabits per second), i.e., the number of megabits stored per second. A higher code rate indicates higher audio/video quality and richer detail, but also occupies more storage space. This embodiment determines the code rate corresponding to the category information of each audio/video according to that category information, and transcodes the audio/video according to the determined code rate, so that different audio/videos may have different code rates.
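As a quick arithmetic sketch (not part of the claimed method), the relation between code rate and stored data size described above can be checked as follows:

```python
def stored_megabytes(code_rate_mbps: float, duration_s: float) -> float:
    """Amount of data stored for a stream at the given code rate.

    code_rate_mbps is in megabits per second; the result is in
    megabytes (1 byte = 8 bits).
    """
    return code_rate_mbps * duration_s / 8

# A 60-second clip stored at 8 Mbps occupies 60 MB.
print(stored_megabytes(8, 60))  # -> 60.0
```

This also shows why a lower code rate for less important segments saves storage space and bandwidth in direct proportion.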
S12, transcoding the plurality of audios and videos according to the determined code rate.
The embodiment can use the existing transcoding technology to transcode the audio and video.
Specifically, since the code rate determined in step S11 corresponds to the category information of the audio/video, audio/videos with different category information may receive different code rates, in which case step S12 converts the plurality of audio/videos to different code rates.
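A minimal sketch of step S12 using the ffmpeg command-line tool (an assumption; the text only says "existing transcoding technology"). The function merely builds the command, with the `-b:v` option setting the target video code rate and `-c:a copy` leaving the audio stream unchanged:

```python
def build_transcode_cmd(src: str, dst: str, code_rate_mbps: float) -> list[str]:
    """Build an ffmpeg command that re-encodes `src` at the given
    video code rate (in Mbps) and writes the result to `dst`."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-b:v", f"{code_rate_mbps}M",  # target video code rate
        "-c:a", "copy",                # copy the audio stream as-is
        dst,
    ]

cmd = build_transcode_cmd("advertise.mp4", "advertise_4m.mp4", 4)
# subprocess.run(cmd, check=True)  # would invoke ffmpeg if it is installed
```

The file names here are hypothetical; each segment of the plurality of audio/videos would be transcoded with its own determined code rate.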
S13, combining the plurality of audios and videos into one audio/video according to the playing order information.
Audio/video merging is an existing technique; the present application sets the position of each of the plurality of audio/videos in the merged result according to the playing order information. For example, when the playing order information of each audio/video is the range of its frame numbers within the merged audio/video, e.g., 0 to 999; 1000 to 9999; ...; 23000 to 29999, the positions of the audio/videos in the merged result can be arranged contiguously in increasing order of frame-number range, so that during normal playback the merged audio/video plays in that order.
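The ordering rule of step S13 can be sketched as sorting the segments by the start of their frame-number ranges (a sketch, assuming each segment carries its frame_range from the playing order information; the segment names are illustrative):

```python
segments = [
    {"name": "feature",   "frame_range": (1000, 9999)},
    {"name": "advertise", "frame_range": (0, 999)},
    {"name": "tail",      "frame_range": (23000, 29999)},
]

# Arrange segments in increasing order of frame-number range: this is
# the order they occupy in the merged audio/video.
merged_order = sorted(segments, key=lambda s: s["frame_range"][0])
print([s["name"] for s in merged_order])  # -> ['advertise', 'feature', 'tail']
```

The actual concatenation of the transcoded segments would then be done by any existing merging tool in this order.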
The application provides an audio/video transcoding method, which obtains the playing order information of each audio/video in a plurality of audio/videos and the category information of each audio/video; determines the code rate corresponding to the category information of each audio/video; transcodes the plurality of audio/videos according to the determined code rates; and combines the plurality of audio/videos into one audio/video according to the playing order information. Because the determined code rate corresponds to the category information, audio/videos with different category information may receive different code rates, so the single audio/video obtained by merging comprises a plurality of audio/video segments with different code rates. Categories that must guarantee playing quality (such as advertisements and feature content) can receive a higher code rate; correspondingly, categories with lower quality requirements (such as opening and closing credits) can receive a lower code rate. This avoids both the high storage and network-bandwidth cost of a uniformly high fixed code rate and the low quality of a uniformly low fixed code rate. The present application can therefore strike a balance between data volume and quality.
Fig. 2 is a schematic diagram of code rate comparison according to an embodiment of the present application. As shown in fig. 2, the code rate of an audio/video calculated in the prior art is fixed: the entire audio/video has the same, unchanging code rate. The code rates determined by the audio/video transcoding method provided by this embodiment differ between different audio/videos; in fig. 2, three code rates are shown, code rate 1 to code rate 3.
Fig. 3 is a schematic diagram of another audio/video transcoding method according to an embodiment of the present application, where, as shown in fig. 3, the audio/video transcoding method may include:
S09, obtaining a preset importance level of the audio/video according to a preset correspondence, wherein the preset importance level has the preset correspondence with the audio/video, and the preset importance level is used for indicating the importance degree of the audio/video;
The preset correspondence may be a correspondence between preset importance levels and audio/videos; it can be set and modified, and is stored in table form so that the preset importance level corresponding to an audio/video can be obtained by querying the table. The importance level may represent the importance of the audio/video content, or the importance of the category information of the audio/video, and may be represented by a number.
Optionally, the preset importance level may also be obtained after identifying the sound and/or image output during playing the audio and video.
S10, obtaining playing order information of each audio and video in a plurality of audio and videos and category information of each audio and video, wherein the category information is used for indicating categories of sounds and/or images output by the audio and video during playing, and the playing order information is used for indicating the playing order of the audio and video in the plurality of audio and videos;
S111, determining the code rate corresponding to the category information and the preset importance level of the audio/video.
Step S111 is a specific implementation of step S11 in the method shown in fig. 1.
S12, transcoding the plurality of audios and videos respectively according to the determined code rate;
S13, combining the plurality of audio/videos into one audio/video according to the playing order information.
The steps S10, S12, S13 are described in the embodiment shown in fig. 1, and are not repeated.
In this embodiment, audio/videos can be distinguished not only by category information but also by importance level. When the code rate is determined according to both the category information and the importance level, the differences in code rate between audio/videos are further embodied, better meeting the high-code-rate requirement of important audio/videos and the low-code-rate requirement of unimportant ones.
Specifically, when the category information of two audio/videos is the same, the one with the higher importance level obtains the higher code rate and video quality. For example, if a larger level number means higher importance and the category information is feature, then feature level k+1 obtains a higher code rate and video quality than feature level k. When the importance levels are the same, this embodiment distinguishes the code rate and video quality by category information; for example, if both are level 0, advertisement level 0 may obtain a higher code rate and video quality than feature level 0. Fig. 4 is a schematic diagram of code rates corresponding to the category information and importance levels of audio/videos according to an embodiment of the present application. As shown in fig. 4, one audio/video may be divided into five audio/videos (opening credits, advertisement A, advertisement B, feature, and closing credits), and the black bold line represents the code rate of each. The opening and closing credits have the same category information and importance level, so their determined code rates are the same; advertisement A and advertisement B have the same category information, but advertisement A has the higher importance level and therefore the higher code rate; advertisement A and the feature have the same importance level but different category information, so their code rates differ.
After this embodiment marks the segmentation positions of an audio/video and sets the category information and importance level of each segment, an info.txt (plain text) file can be generated. Specifically, the info.txt file may contain:
clip_index:0,frame_range:[0,999],pts_range:[0,39.960],clip_type:episode,clip_level:0
clip_index:1,frame_range:[1000,9999],pts_range:[40,399.960],clip_type:advertise,clip_level:2
Wherein clip_index indicates which audio/video segment this is; frame_range represents the sequence-number range of the video frames contained in the current audio/video; pts_range represents the presentation-timestamp range of the current audio/video, i.e., its display time range on the time axis; clip_type represents the category information of the current audio/video; and clip_level represents the importance level of the current audio/video. After obtaining the info.txt file, this embodiment can split one audio/video according to frame_range and pts_range in the file, thereby obtaining multiple audio/videos. Fig. 5 is a schematic diagram of audio/video segmentation according to an embodiment of the present application. As shown in fig. 5, the audio/video may be divided into two audio/videos, advertisement and feature (where advertisement and feature are the category information). The importance level of the advertisement is 2; that of the feature is 1. The advertisement contains video frames with sequence numbers in the range [0, 999]; the feature contains frames in [1000, 9999]. The advertisement's presentation-timestamp range is [0, 39.960]; the feature's is [40, 399.960]. Of course, the audio/video shown in fig. 5 is only an example; in practice the number of audio/videos may be larger.
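A sketch of parsing one info.txt line into a segment record, using the field names from the file format above (the regular expression is an assumption about the exact line layout):

```python
import re

LINE_RE = re.compile(
    r"clip_index:(?P<index>\d+),"
    r"frame_range:\[(?P<f0>\d+),(?P<f1>\d+)\],"
    r"pts_range:\[(?P<p0>[\d.]+),(?P<p1>[\d.]+)\],"
    r"clip_type:(?P<type>\w+),"
    r"clip_level:(?P<level>\d+)"
)

def parse_info_line(line: str) -> dict:
    """Parse one info.txt line into a segment record."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized info.txt line: {line!r}")
    return {
        "clip_index": int(m["index"]),
        "frame_range": (int(m["f0"]), int(m["f1"])),
        "pts_range": (float(m["p0"]), float(m["p1"])),
        "clip_type": m["type"],
        "clip_level": int(m["level"]),
    }

seg = parse_info_line(
    "clip_index:1,frame_range:[1000,9999],pts_range:[40,399.960],"
    "clip_type:advertise,clip_level:2"
)
print(seg["clip_type"], seg["clip_level"])  # -> advertise 2
```

The resulting records carry everything the later steps need: frame_range and pts_range for splitting, clip_type and clip_level for determining the code rate.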
Furthermore, in this embodiment, an audio/video whose number of contained video frames exceeds a preset number may be divided into at least two audio/videos; each audio/video obtained after division is a video frame group, and its category information is the same as that of the original audio/video whose frame count exceeded the preset number. Further, this embodiment may replace the original over-long audio/video with the audio/videos obtained after division.
Specifically, the preset number of frames may be determined according to actual needs. The purpose of this further segmentation is to improve the subsequent audio/video transcoding speed. Each audio/video obtained after division inherits the category information of the audio/video before division, and may of course also inherit its importance level. After division, to preserve the integrity of the original audio/video, the original is replaced by the audio/videos obtained after division. Fig. 6 is a schematic diagram of further audio/video segmentation according to an embodiment of the present application. As shown in fig. 6, the audio/video before division can be divided into opening credits, advertisement A, advertisement B, feature, advertisement C, and closing credits. If the feature contains too many video frames and is too long, this embodiment may further divide it into feature A, feature B, and feature C (the audio/videos obtained after division). Feature A, feature B, and feature C then take the place of the feature in the original sequence, finally yielding one complete audio/video.
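The secondary segmentation described above can be sketched as follows (a sketch; `max_frames` stands in for the preset frame count, and the segment records follow the info.txt fields). Each piece inherits the category information and importance level of the original segment:

```python
def split_segment(seg: dict, max_frames: int) -> list[dict]:
    """Split a segment whose frame count exceeds max_frames into
    consecutive video frame groups; each piece inherits clip_type
    and clip_level from the original segment."""
    start, end = seg["frame_range"]
    if end - start + 1 <= max_frames:
        return [seg]  # short enough, keep as-is
    pieces = []
    for piece_start in range(start, end + 1, max_frames):
        piece_end = min(piece_start + max_frames - 1, end)
        pieces.append({
            "frame_range": (piece_start, piece_end),
            "clip_type": seg["clip_type"],    # inherited category
            "clip_level": seg["clip_level"],  # inherited importance level
        })
    return pieces

feature = {"frame_range": (1000, 9999), "clip_type": "episode", "clip_level": 1}
parts = split_segment(feature, 3000)
print(len(parts))  # -> 3
```

Replacing `feature` by `parts` in the segment list keeps the merged audio/video complete while allowing the three pieces to be transcoded in parallel.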
In another audio/video transcoding method provided according to an embodiment of the present application, step S111 may specifically include: filling the determined category information and importance level as parameters into a preset code rate acquisition function, and running the function to obtain the code rate corresponding to the category information and importance level of the audio/video.
The code rate acquisition function may be a pre-written function that calculates the code rate from a plurality of parameters; in code form it may be expressed as get_video_bit(param_1, param_2, ..., param_N), where the parameters are those that influence the code rate. In this embodiment, the parameters include the category information and importance level of the audio/video. Further, when the audio/video is a video, the parameters may also include at least one of: the type of terminal device, the resolution of the video, and the coding type of the video. The terminal device is the device through which a user interacts with the computer system, and may be a general-purpose, special-purpose, or intelligent terminal device; the resolution of a video refers to the sharpness of the video picture, measured in pixels; the coding type of the video determines how the video data is compressed and stored, and may include H.264/AVC, H.265/HEVC, and VP8/VP9. This embodiment can select among these three parameters according to the actual situation. For example, when the format of the audio/video is acceptable to all terminal devices, only the resolution and the coding type need be selected; when only one resolution is required, only the type of terminal device and the coding type need be selected. Other parameter-selection scenarios follow similarly.
When determining the code rate of the audio/video, this embodiment may directly calculate the code rate corresponding to the category information and importance level using a code rate acquisition function whose parameters include the category information, the importance level, and at least one of the type of terminal device, the resolution of the video, and the coding type of the video.
Alternatively, this embodiment may first calculate a basic code rate of the audio/video from a code rate acquisition function whose parameters include at least one of the type of terminal device, the resolution of the video, and the coding type of the video. Since the category information and importance level distinguish audio/videos (for example, feature level 0 and feature level 1 have different code rates, feature level 0 and advertisement level 0 differ, and advertisement level 1 and advertisement level 2 differ), this embodiment may determine a code rate weight coefficient from the category information and importance level, and take the product of the weight coefficient and the basic code rate as the code rate corresponding to the category information and importance level of the audio/video.
In the process of determining the code rate weight coefficient from the category information and importance level, because the category information is not directly numeric, a numerical value corresponding to the category information is determined first. This value may be set manually, or determined automatically from a pre-stored correspondence table between category information and values. The value of the importance level may simply be its level number; for example, importance level 1 corresponds to the value 1.
For example, the audio/video may be an episode of a television series, and may be divided according to its content into four audio/videos: opening credits, feature, advertisement, and closing credits. This embodiment may set the category information of the four audio/videos, in order, to opening credits, feature, advertisement, and closing credits, and the importance levels, in order, to level 2, level 3, level 1, and level 2.
This embodiment calculates the basic code rate of the four audio/videos, determines a first value corresponding to each one's category information (assume 1, 2, 0.5, and 1), takes the level number of each one's importance level as a second value (2, 3, 1, and 2), and determines each one's code rate weight coefficient from the first and second values. The product of each audio/video's basic code rate and its weight coefficient is then taken as the code rate corresponding to its category information and importance level. For example, for an audio/video whose category information is advertisement: the basic code rate calculated by the code rate acquisition function is 8 Mbps, the first value corresponding to the category information is 0.5, and the level number of the importance level is 1, so the weight coefficient is 0.5 (0.5 = 0.5 x 1) and the code rate corresponding to the category information and importance level is 4 Mbps (4 Mbps = 0.5 x 8 Mbps).
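The worked example above can be reproduced with a minimal sketch of the weight-coefficient scheme (the per-category first values and the category key names are the assumed ones from the example; the weight coefficient is the category value times the importance-level number):

```python
# Assumed first values per category, matching the example above.
CATEGORY_VALUE = {"head": 1.0, "episode": 2.0, "advertise": 0.5, "tail": 1.0}

def weighted_code_rate(base_mbps: float, clip_type: str, clip_level: int) -> float:
    """Code rate = basic code rate x weight coefficient, where the
    weight coefficient = category value x importance-level number."""
    weight = CATEGORY_VALUE[clip_type] * clip_level
    return base_mbps * weight

# Advertisement: basic code rate 8 Mbps, category value 0.5, level 1.
print(weighted_code_rate(8, "advertise", 1))  # -> 4.0
```

In a full implementation the basic code rate would itself come from a get_video_bit-style function of device type, resolution, and coding type.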
Of course, this embodiment may also look up a table according to different value combinations of multiple parameters to obtain the code rate corresponding to each combination, and use that code rate as the one corresponding to the category information and importance level of the audio/video. The table may be a pre-written correspondence table containing the correspondence between every different parameter-value combination of at least one parameter (including the category information of the audio/video) and a code rate.
Fig. 7 is a flowchart of another audio/video transcoding method according to an embodiment of the present application. As shown in fig. 7, with respect to the embodiment shown in fig. 3, in another audio/video transcoding method provided in the embodiment of the present application, the method may further include:
S14, transmitting the combined audio/video code stream to a terminal device;
S15, obtaining feedback information on the combined audio/video from the user of the terminal device;
S16, adjusting the importance level of at least one of the plurality of audio/videos according to the feedback information, and returning to step S111.
The feedback information may be direct opinion feedback from the user on the combined audio/video code stream, or collected adjustment operations by the user on it. Direct opinion feedback may be the user's evaluation of the combined code stream, for example an evaluation indicating that one audio/video looks blurry. An adjustment operation may be the user adjusting the playing progress of the code stream, for instance through a touch or drag operation. This embodiment can adjust the category information and/or importance level of an audio/video according to the user's adjustment operations during viewing, or according to the user's evaluation of the code rate corresponding to the category information and importance level.
Specifically, the user's adjustments to an audio/video can be collected (for example, a pull-back operation on a segment indicates that the user is interested in its content, while a fast-forward operation indicates the opposite), the feedback data gathered, and the category information and/or importance level of the audio/video readjusted accordingly. The next time a new combined code stream is generated, the code rate corresponding to the category information and importance level is redetermined from the adjusted values. For example, when a user repeatedly pulls the progress bar back to one audio/video while viewing the combined code stream, that audio/video is displayed longer on the user's terminal device, indicating the user's interest; the collected feedback information may then include the fact that its display time exceeds that of the other audio/videos, so its importance level can be raised to obtain a higher code rate and video quality.
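Step S16 can be sketched as a simple rule that raises the importance level of any segment whose observed display time is unusually long (a sketch; the threshold ratio and the single-level increment are assumptions, not specified by the text):

```python
def adjust_importance(segments: list[dict], display_time: dict[int, float],
                      threshold_ratio: float = 1.5) -> None:
    """Raise clip_level by one for any segment whose display time
    exceeds threshold_ratio times the average, e.g. because the user
    repeatedly pulled the progress bar back to it."""
    if not display_time:
        return
    avg = sum(display_time.values()) / len(display_time)
    for seg in segments:
        if display_time.get(seg["clip_index"], 0) > threshold_ratio * avg:
            seg["clip_level"] += 1  # higher level -> higher code rate next time

segs = [{"clip_index": 0, "clip_level": 0}, {"clip_index": 1, "clip_level": 2}]
adjust_importance(segs, {0: 40.0, 1: 400.0})
print([s["clip_level"] for s in segs])  # -> [0, 3]
```

After this adjustment, step S111 is run again so the next combined code stream reflects the new importance levels.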
Furthermore, during production of the code stream, this embodiment may also produce complete code streams at multiple resolutions, so that the user can directly switch between complete code streams according to their viewing experience.
The following provides a specific implementation procedure of an audio/video transcoding method of the present application:
fig. 8 is a flowchart of another audio/video transcoding method according to an embodiment of the present application. As shown in fig. 8, the implementation process of the audio/video transcoding method may include:
s801, obtaining playing order information of each audio and video in a plurality of audio and videos set by a user, category information of each audio and video and preset importance level of each audio and video.
Specifically, the play order information, the category information, and the preset importance level may be stored in an information document, and the information in the document may be specifically:
clip_index:0,frame_range:[0,999],pts_range:[0,39.960],clip_type:episode,clip_level:0
clip_index:1,frame_range:[1000,9999],pts_range:[40,399.960],clip_type:advertise,clip_level:2
the information in this document is already described in the foregoing embodiments, and will not be described in detail.
S802, dividing the audio and video based on the playing order information obtained in step S801, so as to obtain a plurality of audio and video mentioned in step S801.
S803, performing secondary segmentation on at least one audio and video obtained by segmentation in the step S802.
The audio and video obtained after the secondary segmentation is the same as the category information and the preset importance level of the audio and video before the secondary segmentation.
S804, filling the determined category information and the importance level into a preset code rate acquisition function as parameters, and running the preset code rate acquisition function to obtain the code rate corresponding to the category information and the importance level of the audio and video.
The preset code rate acquisition function is: get_video_bit(param_1, param_2, ..., param_N).
S805, transcoding the plurality of audios and videos respectively according to the determined code rate;
S806, combining the plurality of audio/videos into one audio/video according to the playing order information.
Corresponding to the embodiment of the method, the application also provides an audio/video transcoding device.
Fig. 9 is a schematic structural diagram of an audio/video transcoding device according to an embodiment of the present application, as shown in fig. 9, where the audio/video transcoding device includes:
an information obtaining unit 901, configured to obtain playing order information of each audio and video in a plurality of audio and video and category information of each audio and video, where the category information is used to indicate a category of sound and/or image output by the audio and video when playing, and the playing order information is used to indicate a playing order of the audio and video in the plurality of audio and video;
a code rate determining unit 902, configured to determine a code rate corresponding to the category information of the audio and video;
the transcoding unit 903 is configured to transcode the plurality of audio/video signals according to the determined code rate;
and a merging unit 904, configured to merge the plurality of audios and videos into one audio and video according to the playing order information.
Optionally, the audio/video transcoding device further includes:
the level obtaining unit is used for obtaining a preset importance level of the audio and video according to a preset corresponding relation, wherein the preset importance level and the audio and video have the preset corresponding relation, and the preset importance level is used for indicating the importance level of the audio and video;
the code rate determining unit is specifically configured to:
and determining code rates corresponding to the category information and the preset importance level of the audio and video.
Optionally, the code rate determining unit is specifically configured to:
and filling the determined category information and the preset importance level into a preset code rate acquisition function as parameters, and operating the preset code rate acquisition function to obtain the code rate corresponding to the category information and the preset importance level of the audio and video.
Optionally, the audio and video are video, and the parameters of the preset code rate acquisition function further include: at least one of a type of terminal device, a resolution of video, and a coding type of video.
Optionally, each audio and video is a video frame group;
the play order information includes: the range of the frame number of the video frame group in the combined audio and video and/or the range of the display time stamp of the video frame group in the combined audio and video.
Optionally, the audio/video transcoding device further includes:
the information obtaining unit is used for obtaining the playing order information of each audio and video in the plurality of audio and videos and the category information of each audio and video, dividing the audio and video with the number of the contained video frames exceeding the preset number into at least two audio and videos, wherein each audio and video obtained after division is a video frame group, and the category information of each audio and video obtained after division is the same as the category information of the audio and video with the number of the contained video frames exceeding the preset number.
Optionally, the plurality of audios and videos are obtained by dividing at least one audio and video, and category information of the divided audios and videos with adjacent relation in the at least one audio and video is different.
Optionally, the audio/video transcoding device further includes:
the sending unit is used for sending the combined audio and video code stream to the terminal equipment;
the feedback obtaining unit is used for obtaining feedback information of the user of the terminal equipment on the combined audio and video;
and the adjusting unit is used for adjusting the preset importance level of at least one of the plurality of audios and videos according to the feedback information, and triggering the code rate determining unit to determine the code rate corresponding to the category information and the preset importance level of the audios and videos again.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the present application further provides an electronic device 1000, where the electronic device 1000 includes at least one processor 1001, and at least one memory 1002 and a bus 1003 connected to the processor 1001; wherein, the processor 1001 and the memory 1002 complete communication with each other through the bus 1003; the processor 1001 is configured to invoke program instructions in the memory 1002 to execute any of the audio/video transcoding methods provided in the embodiments of the present application.
The electronic device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application also provides a computer program product adapted to perform, when executed on an electronic device, any of the audio/video transcoding methods provided in the embodiments of the present application.
The embodiment of the application provides a storage medium, on which a program is stored, which when executed by a processor, implements any one of the audio/video transcoding methods provided in the embodiment of the application.
The embodiment of the application provides a processor, which is used for running a program, wherein any one of the audio and video transcoding methods provided by the embodiment of the application is executed when the program runs.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, the device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory among computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip. Memory is an example of a computer-readable medium.
Computer-readable storage media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments refer to one another, and each embodiment focuses on its differences from the others. In particular, the device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. An audio and video transcoding method, characterized by comprising the following steps:
obtaining playing order information of each of a plurality of audios and videos and category information of each audio and video, wherein the category information indicates the category of the sound and/or image output when the audio and video is played, and the playing order information indicates the playing order of the audio and video among the plurality of audios and videos;
determining a code rate corresponding to the category information of the audio and video;
transcoding the plurality of audios and videos respectively according to the determined code rate;
and combining the plurality of audios and videos into one audio and video according to the playing order information.
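The four steps of claim 1 can be sketched as follows. This is an illustrative sketch only: the `Segment` structure, the category-to-bitrate table, and all function names are assumptions for demonstration, not part of the patent, and the transcode step is a placeholder for a real encoder invocation.

```python
from dataclasses import dataclass

# Illustrative bitrate table (kbps) keyed by category information;
# the actual mapping is implementation-defined.
CATEGORY_BITRATE_KBPS = {"action": 4000, "dialogue": 1500, "credits": 800}

@dataclass
class Segment:
    play_order: int   # playing order information
    category: str     # category information of the sound/image
    data: bytes       # encoded audio/video payload

def determine_bitrate(category: str) -> int:
    """Step 2: determine the code rate corresponding to the category information."""
    return CATEGORY_BITRATE_KBPS.get(category, 1500)

def transcode(segment: Segment, bitrate_kbps: int) -> Segment:
    """Step 3 placeholder: a real system would invoke an encoder here."""
    return Segment(segment.play_order, segment.category, segment.data)

def merge(segments: list[Segment]) -> list[Segment]:
    """Step 4: combine the segments into one stream by playing order."""
    return sorted(segments, key=lambda s: s.play_order)

def transcode_pipeline(segments: list[Segment]) -> list[Segment]:
    """Steps 1-4 end to end for a batch of already-obtained segments."""
    transcoded = [transcode(s, determine_bitrate(s.category)) for s in segments]
    return merge(transcoded)
```

Per-segment bitrates let low-complexity portions (e.g. credits) be transcoded at a lower code rate than high-motion portions, instead of one rate for the whole stream.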
2. The audio-video transcoding method of claim 1, further comprising:
obtaining a preset importance level of the audio and video according to a preset correspondence, wherein the preset importance level has the preset correspondence with the audio and video and indicates the importance of the audio and video;
wherein the determining a code rate corresponding to the category information of the audio and video comprises:
and determining code rates corresponding to the category information and the preset importance level of the audio and video.
3. The method for transcoding audio and video according to claim 2, wherein said determining a code rate corresponding to the category information of the audio and video and a preset importance level includes:
passing the determined category information and the preset importance level as parameters into a preset code rate acquisition function, and running the preset code rate acquisition function to obtain the code rate corresponding to the category information and the preset importance level of the audio and video.
4. The audio/video transcoding method as claimed in claim 3, wherein the audio/video is video, and the parameters of the preset code rate acquisition function further comprise: at least one of a type of terminal device, a resolution of video, and a coding type of video.
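A preset code rate acquisition function of the kind described in claims 3 and 4 might look like the following. Every constant, parameter default, and scaling rule here is a hypothetical illustration; the patent does not specify the function's internals.

```python
def get_bitrate_kbps(category: str, importance: int,
                     terminal_type: str = "phone",
                     resolution: tuple[int, int] = (1920, 1080),
                     codec: str = "h264") -> int:
    """Hypothetical 'preset code rate acquisition function': base rate from
    category information, scaled by importance level, resolution, codec
    efficiency, and terminal type (the optional parameters of claim 4)."""
    base = {"action": 3000, "dialogue": 1200}.get(category, 1500)
    rate = base * (1.0 + 0.2 * importance)                 # higher importance -> higher rate
    rate *= (resolution[0] * resolution[1]) / (1920 * 1080)  # scale by pixel count
    if codec == "h265":
        rate *= 0.6                                        # HEVC needs fewer bits for like quality
    if terminal_type == "tv":
        rate *= 1.2                                        # large screens get more headroom
    return int(rate)
```
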
5. The audio and video transcoding method of claim 1, wherein each of the audios and videos is a video frame group;
the play order information includes: the range of the frame number of the video frame group in the combined audio and video and/or the range of the display time stamp of the video frame group in the combined audio and video.
6. The audio-video transcoding method of claim 5, further comprising, prior to said obtaining the playback order information for each of the plurality of audio-videos and the category information for each of the plurality of audio-videos:
dividing the audio and video with the number of the contained video frames exceeding the preset number into at least two audio and video, wherein each audio and video obtained after division is a video frame group, and the category information of each audio and video obtained after division is the same as the category information of the audio and video with the number of the contained video frames exceeding the preset number.
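The division step of claim 6 can be sketched as a simple chunking routine; the tuple representation and the function name are illustrative assumptions. Each resulting frame group inherits the category information of its parent, as the claim requires.

```python
def split_into_groups(frames: list, category: str, max_frames: int):
    """Divide an audio/video whose frame count exceeds max_frames into
    video frame groups; every group keeps the parent's category information."""
    if len(frames) <= max_frames:
        return [(category, frames)]            # no division needed
    return [(category, frames[i:i + max_frames])
            for i in range(0, len(frames), max_frames)]
```
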
7. The audio and video transcoding method of claim 1, wherein the plurality of audios and videos are obtained by dividing at least one audio and video, and divided audios and videos that are adjacent within the at least one audio and video have different category information.
8. The audio-video transcoding method of claim 2, further comprising:
transmitting the combined audio and video code stream to a terminal device;
obtaining feedback information of the user of the terminal equipment on the combined audio and video;
adjusting the preset importance level of at least one of the plurality of audios and videos according to the feedback information;
and returning to the step of determining the code rate corresponding to the category information and the preset importance level of the audio and video.
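The feedback loop of claim 8 adjusts preset importance levels before bitrates are re-determined. A minimal sketch, assuming feedback is encoded as signed per-segment deltas and levels are clamped to a fixed range; neither assumption comes from the patent.

```python
def adjust_importance(levels: dict[str, int], feedback: dict[str, int],
                      lo: int = 0, hi: int = 5) -> dict[str, int]:
    """Adjust per-segment preset importance levels from viewer feedback:
    positive feedback raises a level, negative lowers it, clamped to [lo, hi].
    Segments without feedback keep their current level."""
    return {seg: max(lo, min(hi, lvl + feedback.get(seg, 0)))
            for seg, lvl in levels.items()}
```

After adjustment, the method loops back to code rate determination, so segments viewers engage with are re-transcoded at higher rates on the next pass.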
9. An audio-video transcoding device, characterized in that the audio-video transcoding device comprises:
the information obtaining unit is used for obtaining the playing order information of each audio and video in the plurality of audio and video and the category information of each audio and video, wherein the category information is used for indicating the category of sound and/or image output by the audio and video when playing, and the playing order information is used for indicating the playing order of the audio and video in the plurality of audio and video;
the code rate determining unit is used for determining the code rate corresponding to the category information of the audio and video;
the transcoding unit is used for transcoding the plurality of audios and videos respectively according to the determined code rate;
and the merging unit is used for merging the plurality of audios and videos into one audio and video according to the playing order information.
10. An electronic device, comprising at least one processor, and at least one memory and a bus connected to the processor;
wherein the processor and the memory communicate with each other through the bus; and the processor is configured to invoke program instructions in the memory to perform the audio and video transcoding method of any of claims 1 to 7.
CN202311865763.8A 2023-12-29 2023-12-29 Audio and video transcoding method and device and electronic equipment Pending CN117812289A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311865763.8A CN117812289A (en) 2023-12-29 2023-12-29 Audio and video transcoding method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN117812289A true CN117812289A (en) 2024-04-02

Family

ID=90429652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311865763.8A Pending CN117812289A (en) 2023-12-29 2023-12-29 Audio and video transcoding method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117812289A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination