CN115103222A - Video audio track processing method and related equipment - Google Patents


Info

Publication number
CN115103222A
CN115103222A (application CN202210722194.0A)
Authority
CN
China
Prior art keywords
video, player, track, highlight, segment
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202210722194.0A
Other languages
Chinese (zh)
Inventor
郝成
刘广宾
赵文娴
李尧彦
Current Assignee (the listed assignee may be inaccurate)
Hunan MgtvCom Interactive Entertainment Media Co Ltd
Original Assignee
Hunan MgtvCom Interactive Entertainment Media Co Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Hunan MgtvCom Interactive Entertainment Media Co Ltd
Priority: CN202210722194.0A
Publication: CN115103222A
Legal status: Pending


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/439: Processing of audio elementary streams
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/81: Monomedia components thereof
    • H04N21/8106: Monomedia components involving special audio data, e.g. different tracks for different languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The video audio track processing method and related device provided by this disclosure can, in response to a user's trigger operation to play a first video on a client, obtain a video identifier of the first video, the client being configured with a first player and a second player; detect first highlight information corresponding to the video identifier, the first highlight information comprising a first segment identifier and a first segment position of a first highlight segment in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtain a first commentary audio track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to the first highlight segment of the first video while controlling the second player to play the first commentary audio track. Because the two players play the video's highlight segment and its commentary track simultaneously, no video file needs to be modified, which improves the viewing atmosphere while relieving storage pressure on the content delivery network.

Description

Video audio track processing method and related equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video track processing method and related devices.
Background
With the development of multimedia technology and the popularization of smart devices, users can enjoy a high-quality audio-visual experience through multimedia service clients on their smart devices.
Currently, to play a commentary audio track while a highlight segment of a video plays, the commentary track must be merged into the video, the merged video file transcoded, and the transcoded file deployed to a Content Delivery Network (CDN), so that a smart device, by user selection or by default, plays the commentary track along with the highlight segment.
However, merging the commentary track with the video amounts to modifying the video file, and the file's integrity is easily damaged in the process. Moreover, the CDN must store both the original and the merged video, which increases its storage pressure.
Therefore, how to play a commentary audio track during a highlight segment of a video without modifying the video media file is a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
In view of the above problems, the present disclosure provides a video audio track processing method and related apparatus that overcome, or at least partially solve, these problems. The technical solutions are as follows:
A video audio track processing method, comprising:
in response to a user's trigger operation to play a first video on a client, obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player;
detecting first highlight information corresponding to the video identifier, wherein the first highlight information comprises a first segment identifier and a first segment position of a first highlight segment in the first video;
detecting whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtaining a first commentary audio track corresponding to the first audio track identifier;
and, using the first segment position, controlling the first player to jump to the first highlight segment of the first video and controlling the second player to play the first commentary audio track.
Optionally, the method further includes:
controlling the first player to stop playing the first video's original audio track while the second player is playing the first commentary track.
Optionally, before responding to the user's trigger operation to play the first video on the client, the method further includes:
determining the first highlight segment in the first video and generating the first highlight information corresponding to the first highlight segment;
constructing a correspondence between the first highlight information and the video identifier of the first video;
obtaining a first commentary text corresponding to the first highlight segment;
converting the first commentary text into the first commentary audio track using a preset timbre;
constructing a correspondence between the first audio track identifier of the first commentary audio track and the first segment identifier.
Optionally, the method further includes:
after the second player finishes playing the first commentary track, controlling the first player to resume playing the first video together with its original audio track.
Optionally, the method further includes:
after the second player finishes playing the first commentary audio track, detecting second highlight information corresponding to the video identifier, wherein the second highlight information comprises a second segment identifier and a second segment position of a second highlight segment in the first video, the second segment position lying after the first segment position;
detecting whether a second audio track identifier corresponding to the second segment identifier exists and, if so, obtaining a second commentary audio track corresponding to the second audio track identifier;
and, using the second segment position, controlling the first player to jump to the second highlight segment of the first video and controlling the second player to play the second commentary audio track.
Optionally, the method further includes:
performing network buffer monitoring on the first player and the second player;
in the case that first playable data cached by the first player for the first highlight segment is smaller than a first preset threshold, controlling the first player to stop playing the first highlight segment and controlling the second player to stop playing the first commentary track; and in the case that the first playable data is not smaller than the first preset threshold, controlling the first player to continue playing the first highlight segment and controlling the second player to continue playing the first commentary track;
and/or, in the case that second playable data cached by the second player for the first commentary track is smaller than a second preset threshold, controlling the second player to stop playing the first commentary track and controlling the first player to stop playing the first highlight segment; and in the case that the second playable data is not smaller than the second preset threshold, controlling the second player to continue playing the first commentary track and controlling the first player to continue playing the first highlight segment.
Optionally, controlling the first player to stop playing the first video's original audio track while the second player plays the first commentary track includes:
while the second player plays the first commentary track, controlling the first player, after decoding the audio data of the first video's original audio track, not to output that audio data to the first player's audio playback component, thereby stopping playback of the original audio track.
A video audio track processing apparatus, comprising a first obtaining unit, a first detection unit, a second detection unit, a second obtaining unit, and a first playing unit, wherein:
the first obtaining unit is configured to obtain a video identifier of a first video in response to a user's trigger operation to play the first video on a client, the client being configured with a first player and a second player;
the first detection unit is configured to detect first highlight information corresponding to the video identifier, the first highlight information comprising a first segment identifier and a first segment position of a first highlight segment in the first video;
the second detection unit is configured to detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, to trigger the second obtaining unit;
the second obtaining unit is configured to obtain a first commentary audio track corresponding to the first audio track identifier;
the first playing unit is configured to, using the first segment position, control the first player to jump to the first highlight segment of the first video and control the second player to play the first commentary audio track.
A computer-readable storage medium on which a program is stored, which when executed by a processor implements a video track processing method as in any one of the above.
An electronic device comprising at least one processor, and at least one memory connected to the processor, a bus; the processor and the memory complete mutual communication through the bus; the processor is configured to invoke program instructions in the memory to perform any of the video track processing methods described above.
By means of the above technical solutions, the video audio track processing method and related device provided by this disclosure can, in response to a user's trigger operation to play a first video on a client, obtain the video identifier of the first video, the client being configured with a first player and a second player; detect first highlight information corresponding to the video identifier, comprising a first segment identifier and a first segment position of a first highlight segment in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists and, if so, obtain a first commentary audio track corresponding to it; and, using the first segment position, control the first player to jump to the first highlight segment of the first video while the second player plays the first commentary audio track. By playing the video's highlight segment and the commentary track simultaneously on two players, no video file needs to be modified, which improves the viewing atmosphere while the user watches the highlight segment and relieves storage pressure on the content delivery network.
The foregoing description is only an overview of the technical solutions of the present disclosure, and the embodiments of the present disclosure are described below in order to make the technical means of the present disclosure more clearly understood and to make the above and other objects, features, and advantages of the present disclosure more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the disclosure. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flow chart diagram illustrating an implementation manner of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 2 is a schematic flow chart diagram illustrating another implementation of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 3 is an explanatory diagram illustrating an overall process of a video audio track processing method provided by an embodiment of the present disclosure;
fig. 4 is an explanatory diagram illustrating a playing process of the video track processing method provided by the embodiment of the disclosure;
fig. 5 is a schematic structural diagram of a video audio track processing apparatus provided by an embodiment of the present disclosure;
fig. 6 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 is a flowchart of one implementation of the video audio track processing method provided by an embodiment of the present disclosure; as shown in fig. 1, the method may include:
s100, responding to a trigger operation of a user for playing a first video on a client, and obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player.
The client may be an application program that provides multimedia services to the user. Optionally, the trigger operation may be a click: the user clicks the play button of the first video on the client, which issues a play instruction for the first video. After receiving the play instruction, the disclosed embodiment obtains the video identifier of the first video.
Optionally, the video identifier may be the first video's unique media-asset number. Optionally, the video identifier may be a 32-bit string.
The first player is used to play the first video; the second player is used to play the commentary audio track.
S200, first highlight information corresponding to the video identification is detected, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video.
The first highlight segment may be a video segment determined in advance in the first video, either manually or by a preset highlight-detection model. It can be understood that the disclosed embodiment may generate the corresponding first highlight information for the first highlight segment in advance.
Alternatively, the first segment identifier may be a unique video segment number for the first highlight segment. Alternatively, the first segment identifier may be a 32-bit string.
Wherein the first segment position is used to indicate a position of the first highlight segment in the first video. Optionally, the first segment position may be a play start time of the first highlight segment in the first video.
Optionally, in the embodiment of the present disclosure, a corresponding relationship between the video identifier and the first highlight information may be pre-constructed, and the first highlight information corresponding to the video identifier may be queried by detecting the video identifier.
Optionally, when no first highlight information corresponding to the video identifier is detected, the disclosed embodiment may simply play the first video normally.
Optionally, based on the method shown in fig. 1, fig. 2 is a flowchart of another implementation of the video audio track processing method provided by an embodiment of the present disclosure. Before step S100, the method may further include:
a100, determining a first highlight segment in a first video, and generating first highlight segment information corresponding to the first highlight segment.
Optionally, any piece of highlight information may be as shown in Table 1; it may include a segment identifier, a segment title, a commentary text, and a segment position.

TABLE 1

| Segment identifier | Segment title | Commentary text | Segment position (seconds) |
| --- | --- | --- | --- |
| 32-bit string | Fun and fun | *** | 361 |
For example, the highlight information may be:
"video_wonderful_info {
    media_uuid: adbxxxxxxx,
    wonderful_text: <text content of the highlight segment's commentary track>,
    seek_point: 2800
}"
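The record above can be modeled as a simple data structure on the client. The following Python sketch is illustrative only: the field names are taken from the example record, but the class and variable names are our own, since the patent does not prescribe an implementation.

```python
from dataclasses import dataclass

@dataclass
class HighlightInfo:
    """One highlight segment's metadata, mirroring video_wonderful_info."""
    media_uuid: str      # segment identifier (a 32-bit string in the patent)
    wonderful_text: str  # commentary text for the segment
    seek_point: int      # segment position: start time in seconds

# Example record, shaped like the one in the description.
info = HighlightInfo(
    media_uuid="adbxxxxxxx",
    wonderful_text="text content of the highlight segment's commentary track",
    seek_point=2800,
)

# The first player can jump straight to the segment start position:
print(f"seek to {info.seek_point}s")  # -> seek to 2800s
```

Holding the position as plain seconds matches Table 1's unit and lets the player seek without inspecting the video file itself.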
And A200, constructing a corresponding relation between the first highlight segment information and the video identification of the first video.
Specifically, the disclosed embodiment may construct a correspondence between the segment identifier in the first highlight information and the video identifier of the first video. Optionally, this correspondence may be as shown in Table 2: by detecting the first video identifier, the first segment identifier corresponding to it can be found, and from that the first highlight information can be obtained.

TABLE 2

| Video identifier | File name | Segment identifier |
| --- | --- | --- |
| 32-bit string | Happy Camp | 32-bit string |
And A300, obtaining a first commentary text corresponding to the first highlight segment.
Optionally, the commentary text may be written in advance by a technician for the highlight segment, or may be collected and recognized from user commentary corresponding to the highlight segment.
The text content of the commentary track can be obtained from the 'commentary text' field of the first highlight information corresponding to the first highlight segment.
And A400, converting the first commentary text into the first commentary audio track using a preset timbre.
Optionally, the disclosed embodiment may use a mature online AI audio platform to convert the commentary text into a commentary audio track with a pre-selected timbre, and store the resulting track.
And A500, constructing the correspondence between the first audio track identifier of the first commentary audio track and the first segment identifier.
Alternatively, the first track identification may be a unique track number of the first commentary track. Alternatively, the first track identification may be a 32-bit string.
Alternatively, the correspondence between the track id and the segment id may be as shown in table 3. The disclosed embodiment can detect a first audio track identifier corresponding to the first segment identifier by detecting the first segment identifier, thereby obtaining a first commentary audio track corresponding to the first audio track identifier.
TABLE 3

| Audio track identifier | Duration (seconds) | Segment identifier |
| --- | --- | --- |
| 32-bit string | 90 | 32-bit string |
By constructing the correspondences among video identifier, segment identifier, and audio track identifier, the disclosed embodiment can generate richly styled, customized commentary tracks from the highlight information and match each commentary track to its highlight segment, improving the user's audio-visual experience.
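Tables 2 and 3 together form a two-hop mapping: video identifier → segment identifier → audio track identifier. The lookup chain of steps S200/S300 can be sketched as follows; the dictionary backing and all identifiers are illustrative, since the patent does not prescribe a storage mechanism.

```python
# Table 2: video identifier -> segment identifier
video_to_segment = {"video-001": "seg-001"}
# Table 1-style store: segment identifier -> highlight info
segment_info = {"seg-001": {"title": "Fun and fun", "seek_point": 361}}
# Table 3: segment identifier -> commentary audio track identifier
segment_to_track = {"seg-001": "track-001"}

def resolve(video_id):
    """Return (seek_point, track_id) for a video, with track_id None
    when no commentary track exists -- mirroring steps S200/S300."""
    seg_id = video_to_segment.get(video_id)
    if seg_id is None:
        return None  # no highlight info: play the video normally
    seek_point = segment_info[seg_id]["seek_point"]
    track_id = segment_to_track.get(seg_id)  # absent -> S300 "no" branch
    return seek_point, track_id

print(resolve("video-001"))  # -> (361, 'track-001')
print(resolve("video-xyz"))  # -> None
```

If the second hop fails, the client still has the seek position, so the first player can jump to the highlight segment without the second player, as step S300's fallback describes.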
S300, detecting whether a first audio track identifier corresponding to the first segment identifier exists; if it exists, executing step S400.
Optionally, when no first audio track identifier corresponding to the first segment identifier is detected, the disclosed embodiment may directly control the first player to play the first highlight segment, without involving the second player at all.
S400, obtaining the first commentary audio track corresponding to the first audio track identifier.
S500, using the first segment position, controlling the first player to jump to the first highlight segment of the first video and controlling the second player to play the first commentary audio track.
Specifically, the client is controlled to start two media player objects (MediaPlayer): the first player locates the first segment position in the first video, jumps directly to it, and plays the first highlight segment; when a first commentary track corresponding to the first highlight segment is detected, the second player is cooperatively controlled to play that commentary track.
Optionally, the embodiment of the disclosure may control the second player to play the first commentary track at a preset volume.
Optionally, the disclosed embodiment may control the first player to stop playing the first video's original audio track while the second player plays the first commentary track.
Specifically, while the second player plays the first commentary track, the first player may be controlled not to output the decoded audio data of the original audio track to its audio playback component, thereby stopping playback of the original audio track.
Optionally, the first player has the ability to discard audio data. While the second player plays the first commentary track, the first player may be controlled to perform the audio-video synchronization operation (AVSYNC) on the decoded audio of the original audio track, discard the buffered decoded audio data, and output nothing to its audio playback component.
Optionally, after the second player finishes playing the first commentary track, the first player may be controlled to resume playing the first video with its original audio track.
After the second player finishes playing the first commentary track, the disclosed embodiment may synchronously update the playback state so that the first player once again outputs the buffered decoded audio data of the original audio track to the audio playback component, restoring normal playback of the first video.
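The mute-by-discard behavior described above amounts to a toggle on the first player: while the commentary plays, decoded audio frames are dropped after AVSYNC instead of being handed to the audio output; when the commentary ends, output resumes. A minimal sketch follows; the class and method names are ours, not the patent's.

```python
class PrimaryPlayer:
    """Video player that can silently drop its own decoded audio."""

    def __init__(self):
        self.discard_audio = False  # set while the commentary track plays
        self.output = []            # stands in for the audio playback component

    def on_decoded_audio(self, frame):
        # A real player would do AVSYNC bookkeeping here; the decoded frame
        # is then either discarded (commentary playing) or output.
        if not self.discard_audio:
            self.output.append(frame)

player = PrimaryPlayer()
player.on_decoded_audio("a1")  # normal playback: frame reaches the output
player.discard_audio = True    # commentary starts: mute the original track
player.on_decoded_audio("a2")  # dropped, never output
player.discard_audio = False   # commentary finished: resume normal output
player.on_decoded_audio("a3")
print(player.output)           # -> ['a1', 'a3']
```

Because the frames are still decoded and synchronized, flipping the flag back restores playback immediately with no seek or re-buffering, which is what lets the first player "resume normal playing" the moment the commentary ends.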
Alternatively, the commentary tracks and videos may be stored separately in different Content Delivery Networks (CDNs).
To aid understanding of the overall flow of the video audio track processing method provided by the disclosed embodiment, refer to fig. 3. As shown in fig. 3, highlight information can be edited manually from the video's media assets, and a commentary track can be generated automatically from that highlight information. When a user chooses to play a video, its video identifier is detected to obtain the corresponding highlight information, playback fast-forwards to the corresponding highlight segment, and the segment identifier is used to check whether a corresponding commentary track exists. If it does, the commentary track is pulled from the commentary-track CDN into the second player, and a mute notification is sent to the first player, so that the first player, after fetching the video from the video CDN, plays it from the highlight segment onward; once the highlight segment ends, normal playback resumes. The disclosed embodiment thus avoids re-transcoding the video, prevents transcoding from damaging the video file, and saves CDN storage resources.
To aid understanding of the playback process of the video audio track processing method, refer to fig. 4. As shown in fig. 4, the disclosed embodiment provides a playback state management component at the client for managing the playback states of the first and second players. First, the two players are created separately. The first player demultiplexes the video into an audio stream and a video stream, obtaining the video's audio data and video data. After decoding the audio data, the first player checks whether a commentary track exists: if so, it discards the audio data; if not, it outputs it. The first player outputs the video data after decoding it, and may query the second player's playback state through the playback state management component. The second player, after pulling the commentary track's audio data, decodes it and plays it at the preset volume until the commentary track ends, updating its own playback state to the management component as it goes.
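The playback state management component amounts to a shared registry that both players write to and read from. The sketch below shows that coordination pattern; the class name, state strings, and player labels are all hypothetical, chosen only to mirror the query/update arrows in fig. 4.

```python
class PlaybackStateManager:
    """Shared registry through which the two players coordinate."""

    def __init__(self):
        self._states = {}  # player label -> current state string

    def update(self, player, state):
        self._states[player] = state

    def query(self, player):
        return self._states.get(player, "idle")

mgr = PlaybackStateManager()

# The second player pulls and decodes the commentary track, then reports:
mgr.update("second", "playing_commentary")

# The first player queries before deciding whether to discard its own audio:
mute_original = mgr.query("second") == "playing_commentary"
print(mute_original)        # -> True

# When the commentary ends, the second player updates its state again,
# and the first player's next query tells it to restore normal output.
mgr.update("second", "finished")
print(mgr.query("second"))  # -> finished
```

Centralizing state this way means neither player calls the other directly: each only talks to the manager, which matches the cooperative-control design the description attributes to the component.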
Optionally, after the second player finishes playing the first commentary track, the disclosed embodiment may detect second highlight information corresponding to the video identifier, the second highlight information comprising a second segment identifier and a second segment position of a second highlight segment in the first video, the second segment position lying after the first segment position; detect whether a second audio track identifier corresponding to the second segment identifier exists and, if so, obtain a second commentary audio track corresponding to it; and, using the second segment position, control the first player to jump to the second highlight segment of the first video and the second player to play the second commentary audio track.
It can be understood that a video may contain multiple highlight segments. After the commentary track for one highlight segment finishes, the disclosed embodiment can detect the next highlight segment in the video and play it together with its commentary track. By playing a video's highlight segments and their corresponding commentary tracks in succession, the disclosed embodiment provides the user with a rich audio-visual experience.
Optionally, the disclosed embodiment may further perform network-buffer monitoring on the first and second players. Specifically, the playback state management component at the client registers a buffering listener (callback) on each player; when network jitter stalls either player, the listener calls back into the playback state management component, which coordinates with the other player's state and, once the stalled player recovers from buffering, notifies the other player to change its playback state accordingly. Because the commentary track need not actually lip-sync with the video picture in the highlight segment, no PTS (Presentation Time Stamp) audio-video synchronization is required between the two players.
Optionally, when the first playable data cached by the first player for the first highlight is smaller than a first preset threshold, controlling the first player to stop playing the first highlight, and controlling the second player to stop playing the first commentary track; and under the condition that the first playable data is not less than a first preset threshold value, controlling the first player to continuously play the first highlight video and controlling the second player to continuously play the first commentary audio track.
Optionally, when second playable data cached by the second player for the first commentary track is smaller than a second preset threshold, the second player is controlled to stop playing the first commentary track and the first player is controlled to stop playing the first highlight; when the second playable data is not less than the second preset threshold, the second player is controlled to continue playing the first commentary track and the first player is controlled to continue playing the first highlight.
Through the play-state management component, the playing, ending, and synchronized buffering states of the two players can be controlled, achieving cooperative control over the two players.
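The threshold rule in the two paragraphs above reduces to a small joint decision: both players pause together if either player's cached playable data falls below its preset threshold, and both continue otherwise. The helper name and threshold values below are illustrative.

```python
def joint_state(video_buffered, commentary_buffered,
                video_threshold, commentary_threshold):
    """Return 'pause' when either buffer is below its preset threshold;
    'not less than' the threshold keeps both players playing."""
    if (video_buffered < video_threshold
            or commentary_buffered < commentary_threshold):
        return "pause"
    return "play"
```

Whatever verdict this returns is then applied to both players by the play-state management component, keeping the highlight and its commentary in step.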
The video track processing method provided by the present disclosure can, in response to a trigger operation of a user for playing a first video on a client, obtain a video identifier of the first video, where the client is configured with a first player and a second player; detect first highlight information corresponding to the video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, obtain a first commentary track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track. By playing the highlight of the video and the commentary track simultaneously through two players, the present disclosure requires no modification of the video file, improves the audio-visual atmosphere when the user watches the highlight, and relieves the storage pressure on the content distribution network.
Although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
Corresponding to the above method embodiment, an embodiment of the present disclosure further provides a video track processing apparatus, whose structure is shown in fig. 5, and may include: a first obtaining unit 100, a first detecting unit 200, a second detecting unit 300, a second obtaining unit 400, and a first playing unit 500.
A first obtaining unit 100, configured to obtain a video identifier of a first video in response to a trigger operation of a user to play the first video on a client, where the client is configured with a first player and a second player.
The first detecting unit 200 is configured to detect first highlight information corresponding to a video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in a first video.
A second detection unit 300, configured to detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, trigger the second obtaining unit 400.
A second obtaining unit 400 is configured to obtain a first commentary track corresponding to the first audio track identifier.
The first playing unit 500 is configured to control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track by using the first segment position.
Optionally, the video track processing apparatus may further include: a first stopping unit.
And the first stopping unit is used for controlling the first player to stop playing the original video track corresponding to the first video under the condition that the second player plays the first commentary track.
Optionally, the video track processing apparatus may further include: a first determining unit, a first constructing unit, a third obtaining unit, a first converting unit, and a second constructing unit.
The first determining unit is configured to, before the response to the trigger operation of the user for playing the first video on the client, determine a first highlight in the first video and generate first highlight information corresponding to the first highlight.
And the first construction unit is used for constructing the corresponding relation between the first highlight segment information and the video identification of the first video.
A third obtaining unit, configured to obtain a first narrative text corresponding to the first highlight.
A first conversion unit, configured to convert the first narration text into a first narration track using a preset timbre.
And the second construction unit is used for constructing the corresponding relation between the first audio track identifier and the first segment identifier of the first commentary audio track.
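The preparation pipeline carried out by the units above can be sketched end to end: generate the highlight information, bind it to the video identifier, synthesize the commentary track from the narration text with a preset timbre, and record the track-identifier/segment-identifier correspondence. This is a hedged sketch: `synthesize_speech` is a stand-in for a real TTS engine, and the `store` layout is an assumption.

```python
def synthesize_speech(text, timbre="preset-timbre"):
    # placeholder for text-to-speech conversion using a preset timbre
    return f"<audio timbre={timbre}>{text}</audio>"

def prepare_highlight(store, video_id, segment_id, position, narration_text):
    """Offline step: register one highlight and its commentary track."""
    # first highlight information: segment identifier + segment position
    info = {"segment_id": segment_id, "position": position}
    # correspondence between highlight information and the video identifier
    store.setdefault("video_to_highlights", {}).setdefault(video_id, []).append(info)
    # convert the narration text into a commentary track
    track_id = f"track:{segment_id}"
    store.setdefault("tracks", {})[track_id] = synthesize_speech(narration_text)
    # correspondence between the audio track identifier and the segment identifier
    store.setdefault("segment_to_track", {})[segment_id] = track_id
    return track_id
```

At play time the client only needs the two correspondence maps: video identifier to highlight information, and segment identifier to commentary track identifier.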
Optionally, the video track processing apparatus may further include: a second playback unit.
And the second playing unit is used for controlling the first player to continue playing the first video and the original video track after the second player finishes playing the first commentary track.
Optionally, the video track processing apparatus may further include: a third detecting unit, a fourth detecting unit, a fourth obtaining unit, and a third playing unit.
A third detecting unit, configured to detect second highlight information corresponding to the video identifier after the second player finishes playing the first commentary track, where the second highlight information includes a second segment identifier and a second segment position corresponding to a second highlight in the first video, and the second segment position is after the first segment position.
And the fourth detecting unit is used for detecting whether a second audio track identifier corresponding to the second segment identifier exists, and if so, triggering the fourth obtaining unit.
A fourth obtaining unit, configured to obtain a second commentary track corresponding to the second audio track identifier.
And the third playing unit is used for controlling the first player to jump to play the second highlight of the first video and controlling the second player to play the second commentary track by using the second segment position.
Optionally, the video track processing apparatus may further include: a first monitoring unit, and a first control unit and/or a second control unit.
The first monitoring unit is used for carrying out network buffer monitoring on the first player and the second player;
the first control unit is used for controlling the first player to stop playing the first highlight segment and controlling the second player to stop playing the first commentary track under the condition that the first playable data cached by the first player on the first highlight segment is smaller than a first preset threshold; and under the condition that the first playable data is not less than the first preset threshold value, controlling the first player to continue playing the first highlight video and controlling the second player to continue playing the first commentary audio track.
The second control unit is used for controlling the second player to stop playing the first commentary track and controlling the first player to stop playing the first highlight under the condition that second playable data cached by the second player for the first commentary track is smaller than a second preset threshold; and under the condition that the second playable data is not less than the second preset threshold, controlling the second player to continue playing the first commentary track and controlling the first player to continue playing the first highlight.
Optionally, the first stopping unit is specifically configured to, under the condition that the second player plays the first commentary track, control the first player to decode the audio data of the original video track corresponding to the first video but not output the decoded audio data to the audio playing component of the first player, so as to stop playing the original video track.
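The muting strategy described above can be sketched as follows: the first player keeps decoding the original track's audio frames (so the decoding pipeline and its timing remain intact), but withholds the decoded output from its audio playing component while the commentary plays. All names here are illustrative assumptions.

```python
def feed_original_audio(encoded_frames, audio_output, commentary_playing):
    """Decode every frame of the original track; output only when no
    commentary is active. Returns how many frames were decoded."""
    decoded_count = 0
    for frame in encoded_frames:
        pcm = f"pcm({frame})"        # decoding still happens for every frame
        decoded_count += 1
        if not commentary_playing:   # withhold output while commentary plays
            audio_output.append(pcm)
    return decoded_count
```

Decoding-but-discarding, rather than tearing down the audio pipeline, means playback of the original track can resume instantly once the commentary finishes.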
The video track processing apparatus provided by the present disclosure can, in response to a trigger operation of a user for playing a first video on a client, obtain a video identifier of the first video, where the client is configured with a first player and a second player; detect first highlight information corresponding to the video identifier, where the first highlight information includes a first segment identifier and a first segment position corresponding to a first highlight in the first video; detect whether a first audio track identifier corresponding to the first segment identifier exists, and if so, obtain a first commentary track corresponding to the first audio track identifier; and, using the first segment position, control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary track. By playing the highlight of the video and the commentary track simultaneously through two players, the present disclosure requires no modification of the video file, improves the audio-visual atmosphere when the user watches the highlight, and relieves the storage pressure on the content distribution network.
With regard to the apparatus in the above-described embodiment, the specific manner in which each unit performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
The video track processing device comprises a processor and a memory, wherein the first obtaining unit 100, the first detecting unit 200, the second detecting unit 300, the second obtaining unit 400, the first playing unit 500 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting the kernel parameters, the highlight of the video and the commentary track are played simultaneously through the two players without modifying the video file, improving the audio-visual atmosphere while relieving the storage pressure on the content distribution network.
The disclosed embodiments provide a computer-readable storage medium having stored thereon a program that, when executed by a processor, implements the video track processing method.
The disclosed embodiment provides a processor configured to run a program, where the program, when running, performs the video track processing method.
As shown in fig. 6, an embodiment of the present disclosure provides an electronic device 1000, where the electronic device 1000 includes at least one processor 1001, and at least one memory 1002 and a bus 1003 connected to the processor 1001; the processor 1001 and the memory 1002 complete communication with each other through the bus 1003; the processor 1001 is used to call program instructions in the memory 1002 to perform the above-described video track processing method. The electronic device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present disclosure also provides a computer program product which, when executed on an electronic device, is adapted to execute a program initialized with the steps of the video track processing method.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, an electronic device includes one or more processors (CPUs), memory, and a bus. The electronic device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM), and includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media such as modulated data signals and carrier waves.
In the description of the present disclosure, it is to be understood that directions or positional relationships indicated by the terms "upper", "lower", "front", "rear", "left", "right", etc. are based on the directions or positional relationships shown in the drawings, and are used only for convenience in describing the present disclosure and simplifying the description; they do not indicate or imply that the referenced positions or elements must have specific orientations or be constructed and operated in specific orientations, and thus are not to be construed as limitations of the present disclosure.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The above are merely examples of the present disclosure, and are not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the scope of the claims of the present disclosure.

Claims (10)

1. A video track processing method, comprising:
responding to a triggering operation of a user for playing a first video on a client, and obtaining a video identifier of the first video, wherein the client is configured with a first player and a second player;
detecting first highlight information corresponding to the video identification, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video;
detecting whether a first audio track identifier corresponding to the first segment identifier exists or not, and if so, obtaining a first commentary audio track corresponding to the first audio track identifier;
and controlling the first player to jump to play the first highlight of the first video and controlling the second player to play the first commentary audio track by using the first segment position.
2. The method of claim 1, further comprising:
controlling the first player to stop playing an original video track corresponding to the first video in the case that the second player plays the first commentary audio track.
3. The method of claim 1, wherein prior to the triggering operation in response to the user playing the first video on the client, the method further comprises:
determining the first highlight segment in the first video and generating the first highlight segment information corresponding to the first highlight segment;
constructing a corresponding relation between the first highlight segment information and the video identification of the first video;
obtaining a first narration text corresponding to the first highlight segment;
converting the first narration text into the first commentary audio track by using a preset timbre;
constructing a correspondence of the first audio track identification and the first segment identification of the first commentary audio track.
4. The method of claim 2, further comprising:
after the second player finishes playing the first commentary audio track, controlling the first player to continue playing the first video and the original video track.
5. The method of claim 1, further comprising:
after the second player finishes playing the first commentary audio track, detecting second highlight information corresponding to the video identifier, wherein the second highlight information comprises a second segment identifier and a second segment position corresponding to a second highlight in the first video, and the second segment position is after the first segment position;
detecting whether a second audio track identifier corresponding to the second segment identifier exists, and if so, obtaining a second commentary audio track corresponding to the second audio track identifier;
and controlling the first player to jump to play the second highlight of the first video and controlling the second player to play the second commentary audio track by using the second segment position.
6. The method of claim 1, further comprising:
performing network buffer monitoring on the first player and the second player;
controlling the first player to stop playing the first highlight and controlling the second player to stop playing the first commentary audio track in the case that first playable data cached by the first player for the first highlight is smaller than a first preset threshold; and in the case that the first playable data is not less than the first preset threshold, controlling the first player to continue playing the first highlight and controlling the second player to continue playing the first commentary audio track;
and/or, in the case that second playable data cached by the second player for the first commentary audio track is smaller than a second preset threshold, controlling the second player to stop playing the first commentary audio track and controlling the first player to stop playing the first highlight; and in the case that the second playable data is not less than the second preset threshold, controlling the second player to continue playing the first commentary audio track and controlling the first player to continue playing the first highlight.
7. The method of claim 2, wherein the controlling the first player to stop playing an original video track corresponding to the first video in the case that the second player plays the first commentary audio track comprises:
in the case that the second player plays the first commentary audio track, controlling the first player to decode audio data of the original video track corresponding to the first video but not output the decoded audio data to an audio playing component of the first player, so as to stop playing the original video track.
8. A video track processing apparatus, comprising: a first obtaining unit, a first detecting unit, a second detecting unit, a second obtaining unit, and a first playing unit, wherein
the first obtaining unit is configured to obtain a video identifier of a first video in response to a trigger operation of a user for playing the first video on a client, where the client is configured with a first player and a second player;
the first detection unit is used for detecting first highlight information corresponding to the video identification, wherein the first highlight information comprises a first segment identification and a first segment position corresponding to a first highlight in the first video;
the second detection unit is used for detecting whether a first audio track identifier corresponding to the first segment identifier exists or not, and if so, the second acquisition unit is triggered;
the second obtaining unit is configured to obtain a first commentary audio track corresponding to the first audio track identifier;
the first playing unit is configured to control the first player to jump to play the first highlight of the first video and control the second player to play the first commentary audio track by using the first segment position.
9. A computer-readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the video track processing method according to any one of claims 1 to 7.
10. An electronic device, comprising at least one processor, at least one memory connected to the processor, and a bus; the processor and the memory communicate with each other through the bus; and the processor is configured to invoke program instructions in the memory to perform the video track processing method according to any one of claims 1 to 7.
CN202210722194.0A 2022-06-24 2022-06-24 Video audio track processing method and related equipment Pending CN115103222A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210722194.0A CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210722194.0A CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Publications (1)

Publication Number Publication Date
CN115103222A true CN115103222A (en) 2022-09-23

Family

ID=83293872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210722194.0A Pending CN115103222A (en) 2022-06-24 2022-06-24 Video audio track processing method and related equipment

Country Status (1)

Country Link
CN (1) CN115103222A (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002315964A (en) * 2001-04-20 2002-10-29 Square Co Ltd Computer readable recording medium with program of video game recorded thereon, program of video game, video game processing method, and video game processor
CN1759909A (en) * 2004-09-15 2006-04-19 微软公司 Online gaming spectator system
CN1781305A (en) * 2003-04-30 2006-05-31 皇家飞利浦电子股份有限公司 Video language filtering based on user profile
CN101124561A (en) * 2003-12-08 2008-02-13 Divx公司 Multimedia distribution system
US20080134260A1 (en) * 2006-12-04 2008-06-05 Qualcomm Incorporated Systems, methods and apparatus for providing sequences of media segments and corresponding interactive data on a channel in a media distribution system
CN107148781A (en) * 2014-10-09 2017-09-08 图兹公司 Produce the customization bloom sequence for describing one or more events
CN107707931A (en) * 2016-08-08 2018-02-16 阿里巴巴集团控股有限公司 Generated according to video data and explain data, data synthesis method and device, electronic equipment
CN108140056A (en) * 2016-01-25 2018-06-08 谷歌有限责任公司 Media program moment guide
CN109618184A (en) * 2018-12-29 2019-04-12 北京市商汤科技开发有限公司 Method for processing video frequency and device, electronic equipment and storage medium
CN110933459A (en) * 2019-11-18 2020-03-27 咪咕视讯科技有限公司 Event video clipping method, device, server and readable storage medium
CN111246283A (en) * 2020-01-17 2020-06-05 北京达佳互联信息技术有限公司 Video playing method and device, electronic equipment and storage medium
CN111953910A (en) * 2020-08-11 2020-11-17 腾讯科技(深圳)有限公司 Video processing method and device based on artificial intelligence and electronic equipment
WO2020231528A1 (en) * 2019-05-14 2020-11-19 Microsoft Technology Licensing, Llc Dynamic video highlight
CN112165648A (en) * 2020-10-19 2021-01-01 腾讯科技(深圳)有限公司 Audio playing method, related device, equipment and storage medium
CN112203116A (en) * 2019-07-08 2021-01-08 腾讯科技(深圳)有限公司 Video generation method, video playing method and related equipment
CN112328834A (en) * 2020-11-10 2021-02-05 北京小米移动软件有限公司 Video association method and device, electronic equipment and storage medium
CN112599144A (en) * 2020-12-03 2021-04-02 Oppo(重庆)智能科技有限公司 Audio data processing method, audio data processing apparatus, medium, and electronic device
CN113329235A (en) * 2021-05-31 2021-08-31 太仓韬信信息科技有限公司 Audio processing method and device and cloud server
CN113630630A (en) * 2021-08-09 2021-11-09 咪咕数字传媒有限公司 Method, device and equipment for processing dubbing information of video commentary
CN113796090A (en) * 2019-05-10 2021-12-14 电影音频私人有限公司 System and method for synchronizing audio content on a mobile device to a separate visual display system


Similar Documents

Publication Publication Date Title
KR101246976B1 (en) Aspects of media content rendering
KR101201000B1 (en) Media foundation media processor
US7861150B2 (en) Timing aspects of media content rendering
JP4551668B2 (en) Minute file generation method, minutes file management method, conference server, and network conference system
US20060236219A1 (en) Media timeline processing infrastructure
CN106155470B (en) A kind of audio file generation method and device
KR101518294B1 (en) Media Recorded with Multi-Track Media File, Method and Apparatus for Editing Multi-Track Media File
US9251256B2 (en) System and method for maintaining cue point data structure independent of recorded time-varying content
CN100484227C (en) Video reproduction apparatus and intelligent skip method therefor
JP2016072858A (en) Media data generation method, media data reproduction method, media data generation device, media data reproduction device, computer readable recording medium and program
WO2012092901A2 (en) Media storage system and method
CN115103222A (en) Video audio track processing method and related equipment
US9685190B1 (en) Content sharing
CN114025229A (en) Method and device for processing audio and video files, computing equipment and storage medium
KR100991264B1 (en) Method and system for playing and sharing music sources on an electric device
JP2021067845A (en) Voice reproduction system and program
WO2006030995A9 (en) Index-based authoring and editing system for video contents
US20220394323A1 (en) Supplmental audio generation system in an audio-only mode
US20160364253A1 (en) Method for dynamic multimedia playback processing
CN116723356A (en) Terminal multimedia data processing method, device, computer equipment and storage medium
JP2009026236A (en) Information processor and program
KR20080104406A (en) Method for providing video service and system thereof
Poole Harnessing the Power of Quick Time
JP2004310330A (en) Program, and method and device therefor
KR20110129118A (en) Apparatus for richmedia messaging service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination