CN115103222A - Video audio track processing method and related equipment - Google Patents


Info

Publication number
CN115103222A
CN115103222A (application CN202210722194.0A)
Authority
CN
China
Prior art keywords
video, player, track, highlight, segment
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202210722194.0A
Other languages
Chinese (zh)
Inventor
郝成
刘广宾
赵文娴
李尧彦
Current Assignee (the listed assignee may be inaccurate)
Hunan MgtvCom Interactive Entertainment Media Co Ltd
Original Assignee
Hunan MgtvCom Interactive Entertainment Media Co Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Hunan MgtvCom Interactive Entertainment Media Co Ltd
Priority: CN202210722194.0A
Publication: CN115103222A
Legal status: Pending


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/439: Processing of audio elementary streams
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/81: Monomedia components thereof
    • H04N21/8106: Monomedia components involving special audio data, e.g. different tracks for different languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The video audio track processing method and related device provided by this disclosure can, in response to a user's trigger operation to play a first video on a client, obtain a video identifier of the first video, the client being configured with a first player and a second player; detect first highlight information corresponding to the video identifier, the first highlight information comprising a first segment identifier and a first segment position of a first highlight segment in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtain a first commentary audio track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to the first highlight segment of the first video while controlling the second player to play the first commentary audio track. Because the two players play the video's highlight segment and its commentary track simultaneously, no video file needs to be modified, which improves the viewing atmosphere while relieving storage pressure on the content delivery network.

Description

Video audio track processing method and related equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video track processing method and related devices.
Background
With the development of multimedia technology and the popularization of smart devices, users can enjoy a high-quality audio-visual experience through multimedia service clients on their smart devices.
Currently, to play a commentary audio track while a highlight segment of a video plays, the commentary track must be merged into the video, the merged video file transcoded, and the transcoded file deployed to a Content Delivery Network (CDN), so that a smart device, by user selection or by default, plays the commentary track along with the highlight segment.
However, merging the commentary track with the video amounts to modifying the video file, and the file's integrity is easily damaged in the process. Moreover, the CDN must store both the original and the merged video, which increases its storage pressure.
Therefore, how to play a commentary audio track during a highlight segment of a video without modifying the video media file is a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
In view of the above problems, the present disclosure provides a video audio track processing method and related apparatus that overcome, or at least partially solve, these problems. The technical solutions are as follows:
A video audio track processing method, comprising:
in response to a user's trigger operation to play a first video on a client, obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player;
detecting first highlight information corresponding to the video identifier, wherein the first highlight information comprises a first segment identifier and a first segment position of a first highlight segment in the first video;
detecting whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtaining a first commentary audio track corresponding to the first audio track identifier;
and, using the first segment position, controlling the first player to jump to the first highlight segment of the first video and controlling the second player to play the first commentary audio track.
Optionally, the method further includes:
controlling the first player to stop playing the first video's original audio track while the second player is playing the first commentary track.
Optionally, before responding to the user's trigger operation to play the first video on the client, the method further includes:
determining the first highlight segment in the first video and generating the first highlight information corresponding to the first highlight segment;
constructing a correspondence between the first highlight information and the video identifier of the first video;
obtaining a first commentary text corresponding to the first highlight segment;
converting the first commentary text into the first commentary audio track using a preset timbre;
constructing a correspondence between the first audio track identifier of the first commentary audio track and the first segment identifier.
Optionally, the method further includes:
after the second player finishes playing the first commentary track, controlling the first player to resume playing the first video together with its original audio track.
Optionally, the method further includes:
after the second player finishes playing the first commentary audio track, detecting second highlight information corresponding to the video identifier, wherein the second highlight information comprises a second segment identifier and a second segment position of a second highlight segment in the first video, the second segment position lying after the first segment position;
detecting whether a second audio track identifier corresponding to the second segment identifier exists and, if so, obtaining a second commentary audio track corresponding to the second audio track identifier;
and, using the second segment position, controlling the first player to jump to the second highlight segment of the first video and controlling the second player to play the second commentary audio track.
Optionally, the method further includes:
performing network buffer monitoring on the first player and the second player;
in the case that first playable data cached by the first player for the first highlight segment is smaller than a first preset threshold, controlling the first player to stop playing the first highlight segment and controlling the second player to stop playing the first commentary track; and in the case that the first playable data is not smaller than the first preset threshold, controlling the first player to continue playing the first highlight segment and controlling the second player to continue playing the first commentary track;
and/or, in the case that second playable data cached by the second player for the first commentary track is smaller than a second preset threshold, controlling the second player to stop playing the first commentary track and controlling the first player to stop playing the first highlight segment; and in the case that the second playable data is not smaller than the second preset threshold, controlling the second player to continue playing the first commentary track and controlling the first player to continue playing the first highlight segment.
Optionally, controlling the first player to stop playing the first video's original audio track while the second player plays the first commentary track includes:
while the second player plays the first commentary track, controlling the first player, after decoding the audio data of the first video's original audio track, not to output that audio data to the first player's audio playback component, thereby stopping playback of the original audio track.
A video audio track processing apparatus, comprising a first obtaining unit, a first detection unit, a second detection unit, a second obtaining unit, and a first playing unit, wherein:
the first obtaining unit is configured to obtain a video identifier of a first video in response to a user's trigger operation to play the first video on a client, the client being configured with a first player and a second player;
the first detection unit is configured to detect first highlight information corresponding to the video identifier, the first highlight information comprising a first segment identifier and a first segment position of a first highlight segment in the first video;
the second detection unit is configured to detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, to trigger the second obtaining unit;
the second obtaining unit is configured to obtain a first commentary audio track corresponding to the first audio track identifier;
the first playing unit is configured to, using the first segment position, control the first player to jump to the first highlight segment of the first video and control the second player to play the first commentary audio track.
A computer-readable storage medium on which a program is stored, which when executed by a processor implements a video track processing method as in any one of the above.
An electronic device comprising at least one processor, and at least one memory connected to the processor, a bus; the processor and the memory complete mutual communication through the bus; the processor is configured to invoke program instructions in the memory to perform any of the video track processing methods described above.
By means of the above technical solutions, the video audio track processing method and related device provided by this disclosure can, in response to a user's trigger operation to play a first video on a client, obtain the video identifier of the first video, the client being configured with a first player and a second player; detect first highlight information corresponding to the video identifier, comprising a first segment identifier and a first segment position of a first highlight segment in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtain a first commentary audio track corresponding to it; and, using the first segment position, control the first player to jump to the first highlight segment of the first video while the second player plays the first commentary audio track. By playing the video's highlight segment and the commentary track simultaneously on two players, no video file needs to be modified, which improves the viewing atmosphere while the user watches the highlight segment and relieves storage pressure on the content delivery network.
The foregoing description is only an overview of the technical solutions of the present disclosure, and the embodiments of the present disclosure are described below in order to make the technical means of the present disclosure more clearly understood and to make the above and other objects, features, and advantages of the present disclosure more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the disclosure. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flow chart diagram illustrating an implementation manner of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 2 is a schematic flow chart diagram illustrating another implementation of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 3 is an explanatory diagram illustrating an overall process of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 4 is an explanatory diagram illustrating a playing process of the video track processing method provided by the embodiment of the disclosure;
fig. 5 is a schematic structural diagram of a video audio track processing apparatus provided by an embodiment of the present disclosure;
fig. 6 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 is a flowchart of one implementation of the video audio track processing method provided by an embodiment of the present disclosure; as shown in fig. 1, the method may include:
s100, responding to a trigger operation of a user for playing a first video on a client, and obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player.
The client may be an application program that provides multimedia services to the user. Optionally, the trigger operation may be a click: the user clicks the play button of the first video on the client, which issues a play instruction for the first video. After receiving the play instruction, the disclosed embodiment obtains the video identifier of the first video.
Optionally, the video identifier may be the first video's unique media-asset number. Optionally, the video identifier may be a 32-bit string.
The first player is used to play the first video; the second player is used to play the commentary audio track.
S200, first highlight information corresponding to the video identification is detected, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video.
The first highlight segment may be a video segment determined in advance in the first video, either manually or by a preset highlight-detection model. It can be understood that the disclosed embodiment may generate the corresponding first highlight information for the first highlight segment in advance.
Alternatively, the first segment identifier may be a unique video segment number for the first highlight segment. Alternatively, the first segment identifier may be a 32-bit string.
Wherein the first segment position is used to indicate a position of the first highlight segment in the first video. Optionally, the first segment position may be a play start time of the first highlight segment in the first video.
Optionally, in the embodiment of the present disclosure, a corresponding relationship between the video identifier and the first highlight information may be pre-constructed, and the first highlight information corresponding to the video identifier may be queried by detecting the video identifier.
Optionally, when no first highlight information corresponding to the video identifier is detected, the disclosed embodiment may simply play the first video normally.
Optionally, based on the method shown in fig. 1, fig. 2 is a flowchart of another implementation of the video audio track processing method provided by an embodiment of the present disclosure. Before step S100, the method may further include:
a100, determining a first highlight segment in a first video, and generating first highlight segment information corresponding to the first highlight segment.
Optionally, any piece of highlight information may be as shown in Table 1; it may include a segment identifier, a segment title, a commentary text, and a segment position.

TABLE 1

| Segment identifier | Segment title | Commentary text | Segment position (seconds) |
| --- | --- | --- | --- |
| 32-bit string | Fun and fun | *** | 361 |
For example, the highlight information may be:
"video_wonderful_info {
    media_uuid: adbxxxxxxx,
    wonderful_text: <text content of the highlight segment's commentary track>,
    seek_point: 2800
}"
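The record above can be modeled as a simple data structure on the client. The following Python sketch is illustrative only: the field names are taken from the example record, but the class and variable names are our own, since the patent does not prescribe an implementation.

```python
from dataclasses import dataclass

@dataclass
class HighlightInfo:
    """One highlight segment's metadata, mirroring video_wonderful_info."""
    media_uuid: str      # segment identifier (a 32-bit string in the patent)
    wonderful_text: str  # commentary text for the segment
    seek_point: int      # segment position: start time in seconds

# Example record, shaped like the one in the description.
info = HighlightInfo(
    media_uuid="adbxxxxxxx",
    wonderful_text="text content of the highlight segment's commentary track",
    seek_point=2800,
)

# The first player can jump straight to the segment start position:
print(f"seek to {info.seek_point}s")  # -> seek to 2800s
```

Holding the position as plain seconds matches Table 1's unit and lets the player seek without inspecting the video file itself.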
And A200, constructing a corresponding relation between the first highlight segment information and the video identification of the first video.
Specifically, the disclosed embodiment may construct a correspondence between the segment identifier in the first highlight information and the video identifier of the first video. Optionally, this correspondence may be as shown in Table 2: by detecting the first video identifier, the first segment identifier corresponding to it can be found, and from that the first highlight information can be obtained.

TABLE 2

| Video identifier | File name | Segment identifier |
| --- | --- | --- |
| 32-bit string | Happy Camp | 32-bit string |
And A300, obtaining a first commentary text corresponding to the first highlight segment.
Optionally, the commentary text may be written in advance by a technician for the highlight segment, or may be collected and recognized from user commentary corresponding to the highlight segment.
The text content of the commentary track can be obtained from the 'commentary text' field of the first highlight information corresponding to the first highlight segment.
And A400, converting the first commentary text into the first commentary audio track using a preset timbre.
Optionally, the disclosed embodiment may use a mature online AI audio platform to convert the commentary text into a commentary audio track with a pre-selected timbre, and store the resulting track.
And A500, constructing the correspondence between the first audio track identifier of the first commentary audio track and the first segment identifier.
Alternatively, the first track identification may be a unique track number of the first commentary track. Alternatively, the first track identification may be a 32-bit string.
Alternatively, the correspondence between the track id and the segment id may be as shown in table 3. The disclosed embodiment can detect a first audio track identifier corresponding to the first segment identifier by detecting the first segment identifier, thereby obtaining a first commentary audio track corresponding to the first audio track identifier.
TABLE 3

| Audio track identifier | Duration (seconds) | Segment identifier |
| --- | --- | --- |
| 32-bit string | 90 | 32-bit string |
By constructing the correspondences among video identifier, segment identifier, and audio track identifier, the disclosed embodiment can generate richly styled, customized commentary tracks from the highlight information and match each commentary track to its highlight segment, improving the user's audio-visual experience.
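Tables 2 and 3 together form a two-hop mapping: video identifier → segment identifier → audio track identifier. The lookup chain of steps S200/S300 can be sketched as follows; the dictionary backing and all identifiers are illustrative, since the patent does not prescribe a storage mechanism.

```python
# Table 2: video identifier -> segment identifier
video_to_segment = {"video-001": "seg-001"}
# Table 1-style store: segment identifier -> highlight info
segment_info = {"seg-001": {"title": "Fun and fun", "seek_point": 361}}
# Table 3: segment identifier -> commentary audio track identifier
segment_to_track = {"seg-001": "track-001"}

def resolve(video_id):
    """Return (seek_point, track_id) for a video, with track_id None
    when no commentary track exists -- mirroring steps S200/S300."""
    seg_id = video_to_segment.get(video_id)
    if seg_id is None:
        return None  # no highlight info: play the video normally
    seek_point = segment_info[seg_id]["seek_point"]
    track_id = segment_to_track.get(seg_id)  # absent -> S300 "no" branch
    return seek_point, track_id

print(resolve("video-001"))  # -> (361, 'track-001')
print(resolve("video-xyz"))  # -> None
```

If the second hop fails, the client still has the seek position, so the first player can jump to the highlight segment without the second player, as step S300's fallback describes.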
S300, detecting whether a first audio track identifier corresponding to the first segment identifier exists; if it exists, executing step S400.
Optionally, when no first audio track identifier corresponding to the first segment identifier is detected, the disclosed embodiment may directly control the first player to play the first highlight segment, without involving the second player at all.
S400, obtaining the first commentary audio track corresponding to the first audio track identifier.
S500, using the first segment position, controlling the first player to jump to the first highlight segment of the first video and controlling the second player to play the first commentary audio track.
Specifically, the client is controlled to start two media player objects (MediaPlayer): the first player locates the first segment position in the first video, jumps directly to it, and plays the first highlight segment; when a first commentary track corresponding to the first highlight segment is detected, the second player is cooperatively controlled to play that commentary track.
Optionally, the embodiment of the disclosure may control the second player to play the first commentary track at a preset volume.
Optionally, the disclosed embodiment may control the first player to stop playing the first video's original audio track while the second player plays the first commentary track.
Specifically, while the second player plays the first commentary track, the first player may be controlled not to output the decoded audio data of the original audio track to its audio playback component, thereby stopping playback of the original audio track.
Optionally, the first player has the ability to discard audio data. While the second player plays the first commentary track, the first player may be controlled to perform the audio-video synchronization operation (AVSYNC) on the decoded audio of the original audio track, discard the buffered decoded audio data, and output nothing to its audio playback component.
Optionally, after the second player finishes playing the first commentary track, the first player may be controlled to resume playing the first video with its original audio track.
After the second player finishes playing the first commentary track, the disclosed embodiment may synchronously update the playback state so that the first player once again outputs the buffered decoded audio data of the original audio track to the audio playback component, restoring normal playback of the first video.
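The mute-by-discard behavior described above amounts to a toggle on the first player: while the commentary plays, decoded audio frames are dropped after AVSYNC instead of being handed to the audio output; when the commentary ends, output resumes. A minimal sketch follows; the class and method names are ours, not the patent's.

```python
class PrimaryPlayer:
    """Video player that can silently drop its own decoded audio."""

    def __init__(self):
        self.discard_audio = False  # set while the commentary track plays
        self.output = []            # stands in for the audio playback component

    def on_decoded_audio(self, frame):
        # A real player would do AVSYNC bookkeeping here; the decoded frame
        # is then either discarded (commentary playing) or output.
        if not self.discard_audio:
            self.output.append(frame)

player = PrimaryPlayer()
player.on_decoded_audio("a1")  # normal playback: frame reaches the output
player.discard_audio = True    # commentary starts: mute the original track
player.on_decoded_audio("a2")  # dropped, never output
player.discard_audio = False   # commentary finished: resume normal output
player.on_decoded_audio("a3")
print(player.output)           # -> ['a1', 'a3']
```

Because the frames are still decoded and synchronized, flipping the flag back restores playback immediately with no seek or re-buffering, which is what lets the first player "resume normal playing" the moment the commentary ends.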
Alternatively, the commentary tracks and videos may be stored separately in different Content Delivery Networks (CDNs).
To aid understanding of the overall flow of the video audio track processing method provided by the disclosed embodiment, refer to fig. 3. As shown in fig. 3, highlight information can be edited manually from the video's media assets, and a commentary track can be generated automatically from that highlight information. When a user chooses to play a video, its video identifier is detected to obtain the corresponding highlight information, playback fast-forwards to the corresponding highlight segment, and the segment identifier is used to check whether a corresponding commentary track exists. If it does, the commentary track is pulled from the commentary-track CDN into the second player, and a mute notification is sent to the first player, so that the first player, after fetching the video from the video CDN, plays it from the highlight segment onward; once the highlight segment ends, normal playback resumes. The disclosed embodiment thus avoids re-transcoding the video, prevents transcoding from damaging the video file, and saves CDN storage resources.
To aid understanding of the playback process of the video audio track processing method, refer to fig. 4. As shown in fig. 4, the disclosed embodiment provides a playback state management component at the client for managing the playback states of the first and second players. First, the two players are created separately. The first player demultiplexes the video into an audio stream and a video stream, obtaining the video's audio data and video data. After decoding the audio data, the first player checks whether a commentary track exists: if so, it discards the audio data; if not, it outputs it. The first player outputs the video data after decoding it, and may query the second player's playback state through the playback state management component. The second player, after pulling the commentary track's audio data, decodes it and plays it at the preset volume until the commentary track ends, updating its own playback state to the management component as it goes.
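The playback state management component amounts to a shared registry that both players write to and read from. The sketch below shows that coordination pattern; the class name, state strings, and player labels are all hypothetical, chosen only to mirror the query/update arrows in fig. 4.

```python
class PlaybackStateManager:
    """Shared registry through which the two players coordinate."""

    def __init__(self):
        self._states = {}  # player label -> current state string

    def update(self, player, state):
        self._states[player] = state

    def query(self, player):
        return self._states.get(player, "idle")

mgr = PlaybackStateManager()

# The second player pulls and decodes the commentary track, then reports:
mgr.update("second", "playing_commentary")

# The first player queries before deciding whether to discard its own audio:
mute_original = mgr.query("second") == "playing_commentary"
print(mute_original)        # -> True

# When the commentary ends, the second player updates its state again,
# and the first player's next query tells it to restore normal output.
mgr.update("second", "finished")
print(mgr.query("second"))  # -> finished
```

Centralizing state this way means neither player calls the other directly: each only talks to the manager, which matches the cooperative-control design the description attributes to the component.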
Optionally, after the second player finishes playing the first commentary track, the disclosed embodiment may detect second highlight information corresponding to the video identifier, the second highlight information comprising a second segment identifier and a second segment position of a second highlight segment in the first video, the second segment position lying after the first segment position; detect whether a second audio track identifier corresponding to the second segment identifier exists and, if so, obtain a second commentary audio track corresponding to it; and, using the second segment position, control the first player to jump to the second highlight segment of the first video and the second player to play the second commentary audio track.
It can be understood that a video may contain multiple highlight segments. After the commentary track for one highlight segment finishes, the disclosed embodiment can detect the next highlight segment in the video and play it together with its commentary track. By playing a video's highlight segments and their corresponding commentary tracks in succession, the disclosed embodiment provides the user with a rich audio-visual experience.
Optionally, the disclosed embodiment may further perform network-buffer monitoring on the first and second players. Specifically, the playback state management component at the client registers a buffering listener (callback) on each player; when network jitter stalls either player, the listener calls back into the playback state management component, which coordinates with the other player's state and, once the stalled player recovers from buffering, notifies the other player to change its playback state accordingly. Because the commentary track need not actually lip-sync with the video picture in the highlight segment, no PTS (Presentation Time Stamp) audio-video synchronization is required between the two players.
Optionally, when the first playable data cached by the first player for the first highlight is smaller than a first preset threshold, controlling the first player to stop playing the first highlight, and controlling the second player to stop playing the first commentary track; and under the condition that the first playable data is not less than a first preset threshold value, controlling the first player to continuously play the first highlight video and controlling the second player to continuously play the first commentary audio track.
Optionally, when second playable data cached by the second player for the first commentary track is smaller than a second preset threshold, the second player is controlled to stop playing the first commentary track and the first player is controlled to stop playing the first highlight; when the second playable data is not less than the second preset threshold, the second player is controlled to continue playing the first commentary track and the first player is controlled to continue playing the first highlight.
Through the play-state management component, the playing, ending, and synchronized buffering states of the two players can be controlled, achieving cooperative control over the two players.
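The threshold rule in the two paragraphs above reduces to a small joint decision: both players pause together if either player's cached playable data falls below its preset threshold, and both continue otherwise. The helper name and threshold values below are illustrative.

```python
def joint_state(video_buffered, commentary_buffered,
                video_threshold, commentary_threshold):
    """Return 'pause' when either buffer is below its preset threshold;
    'not less than' the threshold keeps both players playing."""
    if (video_buffered < video_threshold
            or commentary_buffered < commentary_threshold):
        return "pause"
    return "play"
```

Whatever verdict this returns is then applied to both players by the play-state management component, keeping the highlight and its commentary in step.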
The video track processing method provided by the present disclosure can, in response to a trigger operation of a user for playing a first video on a client, obtain a video identifier of the first video, where the client is configured with a first player and a second player; detect first highlight information corresponding to the video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, obtain a first commentary track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track. By playing the highlight of the video and the commentary track simultaneously through two players, the present disclosure requires no modification of the video file, improves the audio-visual atmosphere when the user watches the highlight, and relieves the storage pressure on the content distribution network.
Although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
Corresponding to the above method embodiment, an embodiment of the present disclosure further provides a video track processing apparatus, whose structure is shown in fig. 5, and may include: a first obtaining unit 100, a first detecting unit 200, a second detecting unit 300, a second obtaining unit 400, and a first playing unit 500.
A first obtaining unit 100, configured to obtain a video identifier of a first video in response to a trigger operation of a user to play the first video on a client, where the client is configured with a first player and a second player.
The first detecting unit 200 is configured to detect first highlight information corresponding to a video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in a first video.
A second detection unit 300, configured to detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, trigger the second obtaining unit 400.
A second obtaining unit 400 is configured to obtain a first commentary track corresponding to the first audio track identifier.
The first playing unit 500 is configured to control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track by using the first segment position.
Optionally, the video track processing apparatus may further include: a first stopping unit.
And the first stopping unit is used for controlling the first player to stop playing the original video track corresponding to the first video under the condition that the second player plays the first commentary track.
Optionally, the video track processing apparatus may further include: a first determining unit, a first constructing unit, a third obtaining unit, a first converting unit, and a second constructing unit.
The first determining unit is configured to, before the response to the trigger operation of the user for playing the first video on the client, determine a first highlight in the first video and generate first highlight information corresponding to the first highlight.
And the first construction unit is used for constructing the corresponding relation between the first highlight segment information and the video identification of the first video.
A third obtaining unit, configured to obtain a first narrative text corresponding to the first highlight.
A first conversion unit, configured to convert the first narration text into a first narration track using a preset timbre.
And the second construction unit is used for constructing the corresponding relation between the first audio track identifier and the first segment identifier of the first commentary audio track.
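The preparation pipeline carried out by the units above can be sketched end to end: generate the highlight information, bind it to the video identifier, synthesize the commentary track from the narration text with a preset timbre, and record the track-identifier/segment-identifier correspondence. This is a hedged sketch: `synthesize_speech` is a stand-in for a real TTS engine, and the `store` layout is an assumption.

```python
def synthesize_speech(text, timbre="preset-timbre"):
    # placeholder for text-to-speech conversion using a preset timbre
    return f"<audio timbre={timbre}>{text}</audio>"

def prepare_highlight(store, video_id, segment_id, position, narration_text):
    """Offline step: register one highlight and its commentary track."""
    # first highlight information: segment identifier + segment position
    info = {"segment_id": segment_id, "position": position}
    # correspondence between highlight information and the video identifier
    store.setdefault("video_to_highlights", {}).setdefault(video_id, []).append(info)
    # convert the narration text into a commentary track
    track_id = f"track:{segment_id}"
    store.setdefault("tracks", {})[track_id] = synthesize_speech(narration_text)
    # correspondence between the audio track identifier and the segment identifier
    store.setdefault("segment_to_track", {})[segment_id] = track_id
    return track_id
```

At play time the client only needs the two correspondence maps: video identifier to highlight information, and segment identifier to commentary track identifier.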
Optionally, the video track processing apparatus may further include: a second playback unit.
And the second playing unit is used for controlling the first player to continue playing the first video and the original video track after the second player finishes playing the first commentary track.
Optionally, the video track processing apparatus may further include: a third detecting unit, a fourth detecting unit, a fourth obtaining unit, and a third playing unit.
A third detecting unit, configured to detect second highlight information corresponding to the video identifier after the second player finishes playing the first commentary track, where the second highlight information includes a second segment identifier and a second segment position corresponding to a second highlight in the first video, and the second segment position is after the first segment position.
And the fourth detecting unit is used for detecting whether a second audio track identifier corresponding to the second segment identifier exists, and if so, triggering the fourth obtaining unit.
A fourth obtaining unit, configured to obtain a second commentary track corresponding to the second audio track identifier.
And the third playing unit is used for controlling the first player to jump to play the second highlight of the first video and controlling the second player to play the second commentary track by using the second segment position.
Optionally, the video track processing apparatus may further include: a first monitoring unit, and a first control unit and/or a second control unit.
The first monitoring unit is used for carrying out network buffer monitoring on the first player and the second player;
the first control unit is used for controlling the first player to stop playing the first highlight segment and controlling the second player to stop playing the first commentary track under the condition that the first playable data cached by the first player on the first highlight segment is smaller than a first preset threshold; and under the condition that the first playable data is not less than the first preset threshold value, controlling the first player to continue playing the first highlight video and controlling the second player to continue playing the first commentary audio track.
The second control unit is used for controlling the second player to stop playing the first commentary track and controlling the first player to stop playing the first highlight under the condition that second playable data cached by the second player for the first commentary track is smaller than a second preset threshold; and under the condition that the second playable data is not less than the second preset threshold, controlling the second player to continue playing the first commentary track and controlling the first player to continue playing the first highlight.
Optionally, the first stopping unit is specifically configured to, under the condition that the second player plays the first commentary track, control the first player to decode the audio data of the original video track corresponding to the first video but not output the decoded audio data to the audio playing component of the first player, so as to stop playing the original video track.
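The muting strategy described above can be sketched as follows: the first player keeps decoding the original track's audio frames (so the decoding pipeline and its timing remain intact), but withholds the decoded output from its audio playing component while the commentary plays. All names here are illustrative assumptions.

```python
def feed_original_audio(encoded_frames, audio_output, commentary_playing):
    """Decode every frame of the original track; output only when no
    commentary is active. Returns how many frames were decoded."""
    decoded_count = 0
    for frame in encoded_frames:
        pcm = f"pcm({frame})"        # decoding still happens for every frame
        decoded_count += 1
        if not commentary_playing:   # withhold output while commentary plays
            audio_output.append(pcm)
    return decoded_count
```

Decoding-but-discarding, rather than tearing down the audio pipeline, means playback of the original track can resume instantly once the commentary finishes.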
The video track processing apparatus provided by the present disclosure can, in response to a trigger operation of a user for playing a first video on a client, obtain a video identifier of the first video, where the client is configured with a first player and a second player; detect first highlight information corresponding to the video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, obtain a first commentary track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track. By playing the highlight of the video and the commentary track simultaneously through two players, the present disclosure requires no modification of the video file, improves the audio-visual atmosphere when the user watches the highlight, and relieves the storage pressure on the content distribution network.
With regard to the apparatus in the above-described embodiment, the specific manner in which each unit performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
The video track processing device comprises a processor and a memory, wherein the first obtaining unit 100, the first detecting unit 200, the second detecting unit 300, the second obtaining unit 400, the first playing unit 500 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting the kernel parameters, the highlight of the video and the commentary track are played simultaneously through the two players without modifying the video file, improving the audio-visual atmosphere while relieving the storage pressure on the content distribution network.
The disclosed embodiments provide a computer-readable storage medium having stored thereon a program that, when executed by a processor, implements the video track processing method.
The disclosed embodiment provides a processor configured to run a program, where the program, when running, performs the video track processing method.
As shown in fig. 6, an embodiment of the present disclosure provides an electronic device 1000, where the electronic device 1000 includes at least one processor 1001, and at least one memory 1002 and a bus 1003 connected to the processor 1001; the processor 1001 and the memory 1002 complete communication with each other through the bus 1003; the processor 1001 is used to call program instructions in the memory 1002 to perform the above-described video track processing method. The electronic device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present disclosure also provides a computer program product which, when executed on an electronic device, is adapted to execute a program initialized with the steps of the video track processing method.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, an electronic device includes one or more processors (CPUs), memory, and a bus. The electronic device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM), and includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media such as modulated data signals and carrier waves.
In the description of the present disclosure, it is to be understood that directions or positional relationships indicated by the terms "upper", "lower", "front", "rear", "left", "right", etc. are based on the directions or positional relationships shown in the drawings, and are used only for convenience in describing the present disclosure and simplifying the description; they do not indicate or imply that the referenced positions or elements must have specific orientations or be constructed and operated in specific orientations, and thus are not to be construed as limitations of the present disclosure.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The above are merely examples of the present disclosure, and are not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the scope of the claims of the present disclosure.

Claims (10)

1. A video track processing method, comprising:
responding to a triggering operation of a user for playing a first video on a client, and obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player;
detecting first highlight information corresponding to the video identification, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video;
detecting whether a first audio track identifier corresponding to the first segment identifier exists or not, and if so, obtaining a first commentary audio track corresponding to the first audio track identifier;
and controlling the first player to jump to play the first highlight of the first video and controlling the second player to play the first commentary audio track by using the first segment position.
2. The method of claim 1, further comprising:
controlling the first player to stop playing an original video track corresponding to the first video in the case that the second player plays the first commentary audio track.
3. The method of claim 1, wherein prior to the triggering operation in response to the user playing the first video on the client, the method further comprises:
determining the first highlight segment in the first video and generating the first highlight segment information corresponding to the first highlight segment;
constructing a corresponding relation between the first highlight segment information and the video identification of the first video;
obtaining a first narration text corresponding to the first highlight segment;
converting the first narration text into the first commentary audio track by using a preset timbre;
constructing a correspondence of the first audio track identification and the first segment identification of the first commentary audio track.
4. The method of claim 2, further comprising:
after the second player finishes playing the first commentary audio track, controlling the first player to continue playing the first video and the original video track.
5. The method of claim 1, further comprising:
after the second player finishes playing the first commentary audio track, detecting second highlight information corresponding to the video identifier, wherein the second highlight information comprises a second segment identifier and a second segment position corresponding to a second highlight in the first video, and the second segment position is after the first segment position;
detecting whether a second audio track identifier corresponding to the second segment identifier exists, and if so, obtaining a second commentary audio track corresponding to the second audio track identifier;
and controlling the first player to jump to play the second highlight of the first video and controlling the second player to play the second commentary audio track by using the second segment position.
6. The method of claim 1, further comprising:
performing network buffer monitoring on the first player and the second player;
controlling the first player to stop playing the first highlight and controlling the second player to stop playing the first commentary audio track in the case that first playable data cached by the first player for the first highlight is smaller than a first preset threshold; and in the case that the first playable data is not less than the first preset threshold, controlling the first player to continue playing the first highlight and controlling the second player to continue playing the first commentary audio track;
and/or, in the case that second playable data cached by the second player for the first commentary audio track is smaller than a second preset threshold, controlling the second player to stop playing the first commentary audio track and controlling the first player to stop playing the first highlight; and in the case that the second playable data is not less than the second preset threshold, controlling the second player to continue playing the first commentary audio track and controlling the first player to continue playing the first highlight.
7. The method of claim 2, wherein the controlling the first player to stop playing an original video track corresponding to the first video in the case that the second player plays the first commentary audio track comprises:
in the case that the second player plays the first commentary audio track, controlling the first player to decode audio data of the original video track corresponding to the first video but not output the decoded audio data to an audio playing component of the first player, so as to stop playing the original video track.
8. A video track processing apparatus, comprising: a first obtaining unit, a first detecting unit, a second detecting unit, a second obtaining unit, and a first playing unit, wherein
the first obtaining unit is configured to obtain a video identifier of a first video in response to a trigger operation of a user for playing the first video on a client, where the client is configured with a first player and a second player;
the first detection unit is used for detecting first highlight information corresponding to the video identification, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video;
the second detection unit is used for detecting whether a first audio track identifier corresponding to the first segment identifier exists or not, and if so, the second acquisition unit is triggered;
the second obtaining unit is configured to obtain a first commentary audio track corresponding to the first audio track identifier;
the first playing unit is configured to control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary audio track by using the first segment position.
9. A computer-readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the video track processing method according to any one of claims 1 to 7.
10. An electronic device, comprising at least one processor, at least one memory connected to the processor, and a bus; the processor and the memory communicate with each other through the bus; and the processor is configured to invoke program instructions in the memory to perform the video track processing method according to any one of claims 1 to 7.
CN202210722194.0A 2022-06-24 2022-06-24 Video audio track processing method and related equipment Pending CN115103222A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210722194.0A CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210722194.0A CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Publications (1)

Publication Number Publication Date
CN115103222A true CN115103222A (en) 2022-09-23

Family

ID=83293872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210722194.0A Pending CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Country Status (1)

Country Link
CN (1) CN115103222A (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002315964A (en) * 2001-04-20 2002-10-29 Square Co Ltd Computer readable recording medium with program of video game recorded thereon, program of video game, video game processing method, and video game processor
CN1759909A (en) * 2004-09-15 2006-04-19 微软公司 Online gaming spectator system
CN1781305A (en) * 2003-04-30 2006-05-31 皇家飞利浦电子股份有限公司 Video language filtering based on user profile
CN101124561A (en) * 2003-12-08 2008-02-13 Divx公司 Multimedia distribution system
US20080134260A1 (en) * 2006-12-04 2008-06-05 Qualcomm Incorporated Systems, methods and apparatus for providing sequences of media segments and corresponding interactive data on a channel in a media distribution system
CN107148781A (en) * 2014-10-09 2017-09-08 图兹公司 Produce the customization bloom sequence for describing one or more events
CN107707931A (en) * 2016-08-08 2018-02-16 阿里巴巴集团控股有限公司 Generated according to video data and explain data, data synthesis method and device, electronic equipment
CN108140056A (en) * 2016-01-25 2018-06-08 谷歌有限责任公司 Media program moment guide
CN109618184A (en) * 2018-12-29 2019-04-12 北京市商汤科技开发有限公司 Method for processing video frequency and device, electronic equipment and storage medium
CN110933459A (en) * 2019-11-18 2020-03-27 咪咕视讯科技有限公司 Event video clipping method, device, server and readable storage medium
CN111246283A (en) * 2020-01-17 2020-06-05 北京达佳互联信息技术有限公司 Video playing method and device, electronic equipment and storage medium
CN111953910A (en) * 2020-08-11 2020-11-17 腾讯科技(深圳)有限公司 Video processing method and device based on artificial intelligence and electronic equipment
WO2020231528A1 (en) * 2019-05-14 2020-11-19 Microsoft Technology Licensing, Llc Dynamic video highlight
CN112165648A (en) * 2020-10-19 2021-01-01 腾讯科技(深圳)有限公司 Audio playing method, related device, equipment and storage medium
CN112203116A (en) * 2019-07-08 2021-01-08 腾讯科技(深圳)有限公司 Video generation method, video playing method and related equipment
CN112328834A (en) * 2020-11-10 2021-02-05 北京小米移动软件有限公司 Video association method and device, electronic equipment and storage medium
CN112599144A (en) * 2020-12-03 2021-04-02 Oppo(重庆)智能科技有限公司 Audio data processing method, audio data processing apparatus, medium, and electronic device
CN113329235A (en) * 2021-05-31 2021-08-31 太仓韬信信息科技有限公司 Audio processing method and device and cloud server
CN113630630A (en) * 2021-08-09 2021-11-09 咪咕数字传媒有限公司 Method, device and equipment for processing dubbing information of video commentary
CN113796090A (en) * 2019-05-10 2021-12-14 电影音频私人有限公司 System and method for synchronizing audio content on a mobile device to a separate visual display system


Similar Documents

Publication Publication Date Title
KR101246976B1 (en) Aspects of media content rendering
KR101201000B1 (en) Media foundation media processor
US7861150B2 (en) Timing aspects of media content rendering
JP4551668B2 (en) Minute file generation method, minutes file management method, conference server, and network conference system
US20060236219A1 (en) Media timeline processing infrastructure
CN106155470B (en) A kind of audio file generation method and device
KR101518294B1 (en) Media Recorded with Multi-Track Media File, Method and Apparatus for Editing Multi-Track Media File
US9251256B2 (en) System and method for maintaining cue point data structure independent of recorded time-varying content
CN100484227C (en) Video reproduction apparatus and intelligent skip method therefor
JP2016072858A (en) Media data generation method, media data reproduction method, media data generation device, media data reproduction device, computer readable recording medium and program
WO2012092901A2 (en) Media storage system and method
CN115103222A (en) Video audio track processing method and related equipment
US9685190B1 (en) Content sharing
CN114025229A (en) Method and device for processing audio and video files, computing equipment and storage medium
KR100991264B1 (en) Method and system for playing and sharing music sources on an electric device
JP2021067845A (en) Voice reproduction system and program
WO2006030995A9 (en) Index-based authoring and editing system for video contents
US20220394323A1 (en) Supplmental audio generation system in an audio-only mode
US20160364253A1 (en) Method for dynamic multimedia playback processing
CN116723356A (en) Terminal multimedia data processing method, device, computer equipment and storage medium
JP2009026236A (en) Information processor and program
KR20080104406A (en) Method for providing video service and system thereof
Poole Harnessing the Power of Quick Time
JP2004310330A (en) Program, and method and device therefor
KR20110129118A (en) Apparatus for richmedia messaging service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination