CN110944214B

CN110944214B - Method, device, equipment, system and storage medium for intercepting song climax video segments

Info

Publication number: CN110944214B
Application number: CN201911240857.XA
Authority: CN
Inventors: 伍威威; 韦传毅; 彭剑龙; 姚俊; 莫钦善
Original assignee: Guangzhou Kugou Computer Technology Co Ltd
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Priority date: 2019-12-06
Filing date: 2019-12-06
Publication date: 2021-09-14
Anticipated expiration: 2039-12-06
Also published as: CN110944214A

Abstract

The application discloses a method, an apparatus, a device, a system and a storage medium for intercepting song climax video segments, and belongs to the technical field of the internet. The method comprises the following steps: during a live video broadcast, when a singing instruction corresponding to a target song is received, playing accompaniment audio data of the target song and sending a singing start notification to the server; when playback of the accompaniment audio data ends, sending a singing end notification to the server; receiving address information of the song climax video segment sent by the server; and downloading the song climax video segment based on the address information. There is no need to start a video editing application, import the live video of the singing process into it, and intercept the climax video segment there. The method provided by the embodiment of the application has simple steps and improves the efficiency of intercepting climax video segments.

Description

Method, device, equipment, system and storage medium for intercepting song climax video segments

Technical Field

The present application relates to the field of internet technologies, and in particular, to a method, an apparatus, a device, a system, and a storage medium for intercepting a song climax video segment.

Background

With the rapid development of the live broadcast industry, more and more users enjoy singing songs on live broadcast platforms. During a live video broadcast, the video segment corresponding to the climax of a song sung by a user can be intercepted, and the song climax video segment can then be played or shared.

In the related art, a user may install on a terminal a video editing application for intercepting climax video segments. During a live video broadcast, the user can operate a live broadcast application to play an accompaniment and sing; the live broadcast application can record the live video of the singing process and store the recorded live video locally. The user can then import the live video of the singing process into the video editing application and operate the video editing application to intercept the climax video segment from the live video.

With this method, the video editing application must be started, the live video of the singing process must be imported into it, and the climax video segment must then be intercepted. The steps are cumbersome, which reduces the efficiency of intercepting climax video segments.

Disclosure of Invention

The embodiment of the application provides a method, an apparatus, a device, a system and a storage medium for intercepting song climax video segments, which can solve the problems of cumbersome steps and low efficiency when intercepting climax video segments. The technical scheme is as follows:

In one aspect, a method for intercepting a song climax video segment is provided, and the method comprises the following steps:

in the video live broadcasting process, when a singing instruction corresponding to a target song is received, playing accompaniment audio data of the target song and sending a singing starting notice to a server;

when the playing of the accompaniment audio data is finished, sending a singing ending notice to the server;

receiving address information of the song climax video segment sent by the server;

and downloading the song climax video segment based on the address information.

In one possible implementation, after the sending of the singing end notification to the server, the method further includes:

receiving a thumbnail video of the song climax video segment sent by the server;

displaying a download prompting window, and playing the thumbnail video in the download prompting window;

the downloading of the song climax video segment based on the address information comprises:

and when a confirmation instruction triggered through the download prompt window is received, downloading the song climax video segment based on the address information.

In one possible implementation, the address information of the song climax video segment includes address information of a plurality of song climax video segments, and after the singing end notification is sent to the server, the method further includes:

receiving thumbnail videos of a plurality of song climax video segments sent by the server;

displaying a download prompting window, and displaying options corresponding to a plurality of thumbnail videos in the download prompting window;

when a selection instruction of a first thumbnail video in the plurality of thumbnail videos is received, playing the first thumbnail video;

and when a downloading instruction corresponding to the second thumbnail video is received, downloading a song climax video segment corresponding to the second thumbnail video based on the address information of the second thumbnail video.
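The multi-segment download prompt described above can be sketched as a small state holder. This is only an illustration under stated assumptions: the class name `DownloadPrompt`, its method names, and the sample thumbnail identifiers and addresses are hypothetical and do not come from the application.

```python
class DownloadPrompt:
    """Hypothetical model of a download prompt window with several thumbnails."""

    def __init__(self, options):
        # options maps a thumbnail identifier to the address information
        # of the corresponding song climax video segment.
        self._options = dict(options)
        self.now_playing = None

    def select(self, thumbnail_id):
        """Selection instruction: start playing the chosen thumbnail video."""
        if thumbnail_id not in self._options:
            raise KeyError(thumbnail_id)
        self.now_playing = thumbnail_id

    def download_address(self, thumbnail_id):
        """Download instruction: return the address of the full segment."""
        return self._options[thumbnail_id]


prompt = DownloadPrompt({"thumb-1": "clips/seg-1", "thumb-2": "clips/seg-2"})
prompt.select("thumb-1")                          # preview the first thumbnail
address = prompt.download_address("thumb-2")      # download the second segment
```

Selecting a thumbnail and downloading a segment are kept as separate operations, mirroring the text, where the thumbnail that is previewed need not be the one that is downloaded.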

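In another aspect, a method for intercepting a song climax video segment is provided, and the method comprises the following steps:

receiving a singing start notification and a singing end notification which are sent by a terminal during a live video broadcast;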

acquiring live broadcast video uploaded by the terminal in a time period of receiving the singing starting notice and the singing ending notice, and acquiring audio data in the live broadcast video;

searching target song audio data with the highest matching degree with the audio data in song audio data stored in a song library;

determining a target climax time period corresponding to the target song audio data according to a corresponding relation between pre-stored song audio data and the climax time period;

and intercepting a song climax video segment in the live video based on the target climax time segment.

In one possible implementation, after intercepting a song climax video segment in the live video based on the target climax time segment, the method further includes:

storing the song climax video segment and acquiring the address information of the song climax video segment;

and sending the address information of the song climax video segment to the terminal.

generating a thumbnail video of the song climax video segment;

and sending the thumbnail video of the song climax video segment to the terminal.

In one possible implementation manner, the target song audio data corresponds to a plurality of target climax time periods, the live video includes a plurality of song climax video segments, and after the song climax video segments are intercepted in the live video based on the target climax time periods, the method further includes:

generating a thumbnail video of each song climax video segment;

and sending the thumbnail video of each song climax video segment to the terminal.

In another aspect, an apparatus for intercepting a song climax video segment is provided, the apparatus comprising:

the first sending module is used for playing accompaniment audio data of a target song and sending a singing starting notice to a server when a singing instruction corresponding to the target song is received in a live video broadcasting process;

the second sending module is used for sending a singing ending notice to the server when the playing of the accompaniment audio data is finished;

the first receiving module is used for receiving the address information of the video segment of the song climax sent by the server;

and the downloading module is used for downloading the song climax video segment based on the address information.

In one possible implementation manner, after the sending of the singing end notification to the server, the apparatus further includes:

the second receiving module is used for receiving the thumbnail video of the song climax video segment sent by the server;

the display and play module is used for displaying a download prompt window and playing the thumbnail video in the download prompt window;

the download module is further configured to:

In one possible implementation manner, the address information of the song climax video segment includes address information of a plurality of song climax video segments, and after the singing end notification is sent to the server, the apparatus further includes:

the third receiving module is used for receiving the thumbnail videos of the plurality of song climax video segments sent by the server;

the display module is used for displaying a download prompting window and displaying options corresponding to a plurality of thumbnail videos in the download prompting window;

the playing module is used for playing a first thumbnail video in the plurality of thumbnail videos when a selection instruction of the first thumbnail video is received;

the download module is further configured to:

the receiving module is used for receiving a singing starting notice and a singing ending notice which are sent by the terminal in the video live broadcasting process;

the acquisition module is used for acquiring live broadcast videos uploaded by the terminal in the time period of receiving the singing starting notification and the singing ending notification and acquiring audio data in the live broadcast videos;

the searching module is used for searching the audio data of the target song with the highest matching degree with the audio data in the audio data of the songs stored in the song library;

the determining module is used for determining a target climax time period corresponding to the target song audio data according to the corresponding relation between the pre-stored song audio data and the climax time period;

and the intercepting module is used for intercepting a song climax video segment in the live video based on the target climax time segment.

In one possible implementation manner, after intercepting a song climax video segment in the live video based on the target climax time segment, the apparatus further includes:

the storage and acquisition module is used for storing the song climax video segment and acquiring the address information of the song climax video segment;

and the first sending module is used for sending the address information of the climax video segment to the terminal.

the first generation module is used for generating a thumbnail video of the song climax video segment;

and the second sending module is used for sending the thumbnail video of the song climax video segment to the terminal.

In one possible implementation manner, the target song audio data corresponds to a plurality of target climax time periods, the live video includes a plurality of song climax video segments, and the apparatus further includes, after the song climax video segments are intercepted in the live video based on the target climax time periods:

the second generation module is used for generating a thumbnail video of each song climax video segment;

and the third sending module is used for sending the thumbnail video of each song climax video segment to the terminal.

In another aspect, a system for intercepting song climax video segments is provided, which includes a terminal and a server, wherein:

the terminal is used for playing accompaniment audio data of a target song and sending a singing start notification to the server when a singing instruction corresponding to the target song is received during a live video broadcast; when playback of the accompaniment audio data ends, sending a singing end notification to the server; receiving address information of the song climax video segment sent by the server; and downloading the song climax video segment based on the address information;

the server is used for receiving the singing starting notification and the singing ending notification sent by the terminal in the video live broadcasting process; acquiring live broadcast video uploaded by the terminal in a time period of receiving the singing starting notice and the singing ending notice, and acquiring audio data in the live broadcast video; searching target song audio data with the highest matching degree with the audio data in song audio data stored in a song library; determining a target climax time period corresponding to the target song audio data according to a corresponding relation between pre-stored song audio data and the climax time period; and intercepting a song climax video segment in the live video based on the target climax time segment.

In still another aspect, a computer device is provided and includes a processor and a memory, where the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the operations performed by the song climax video segment capturing method.

In still another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the operations performed by the method for intercepting a song climax video segment.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

During a live video broadcast, when the terminal receives a singing instruction corresponding to a target song, it plays the accompaniment audio data of the target song and sends a singing start notification to the server; when playback of the accompaniment audio data ends, the terminal sends a singing end notification to the server. The terminal then receives the address information of the song climax video segment sent by the server and downloads the song climax video segment according to the address information. With this method, there is no need to start a video editing application, import the live video of the singing process into it, and intercept the climax video segment there, so the steps are simple and the efficiency of intercepting climax video segments is improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

Fig. 1 is a schematic diagram of an implementation environment of a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 2 is a flowchart of the terminal side in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 3 is a flowchart of the server side in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 4 is a flowchart of the interaction between a terminal and a server in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 5 is a schematic interface diagram of a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 6 is a schematic diagram of an interface for selecting the accompaniment audio data of the target song in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 7 is a block diagram of a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 8 is an interface diagram of a download prompt window in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 9 is an interface diagram of a download prompt window in a method for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an apparatus for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of an apparatus for intercepting a song climax video segment according to an embodiment of the present application;

Fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application;

Fig. 13 is a schematic structural diagram of a server according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

Fig. 1 is a schematic diagram of an implementation environment of a method for intercepting a song climax video segment according to an embodiment of the present application. Referring to Fig. 1, the implementation environment includes a terminal and a server. The method for intercepting song climax video segments can be implemented jointly by the terminal and the server.

The terminal may establish communication with the server through a wireless network or a wired network. The terminal may be at least one of a smartphone, a desktop computer, a tablet computer, and a laptop computer. The terminal may have components such as a camera and a speaker, and may have installed and running on it an application that supports live broadcast services. The application can be any one of a video viewing program, a social application, an instant messaging application and an information sharing program.

As an example, the terminal may be a terminal used by an anchor user, and an anchor user account and an ordinary user account may each be logged in to the application running on the terminal. During a live video broadcast, when a singing instruction triggered by the anchor user is received, the terminal may send a singing start notification to the server, and when playback of the song finishes, the terminal may send a singing end notification to the server. The terminal uploads to the server the live video for the period from the start of singing to the end of singing. After the song climax video segment has been intercepted, the terminal receives the address information of the song climax video segment sent by the server and displays a download prompt window, sent by the server, to inform the anchor user that the song climax video segment has been intercepted and can be downloaded.

The server may be a background server of the application installed and running on the terminal. The server may be a single server or a server group. If it is a single server, that server may be responsible for all processing in the following scheme. If it is a server group, different servers in the group may be responsible for different parts of the processing in the following scheme; the specific allocation of processing can be set by technical personnel according to actual requirements and is not described further here.

As an example, the server may be a server that intercepts live video. During a live broadcast, the server can receive the singing start notification and the singing end notification sent by the terminal, obtain the live video for the period between the start and the end of singing, and intercept the song climax video segment from that live video.

The server may include a song library, a user information database, a user behavior information database, and the like. The song library stores the audio data of songs and the corresponding climax time periods of those songs. It is used to search for the song audio data with the highest matching degree with the audio data in the live video uploaded by the terminal, so as to determine which song's accompaniment the anchor user used when singing. The song library may also be used to look up the climax time periods corresponding to that song audio data, so as to determine which parts of the live video are song climax segments. The user information database stores users' profile data and the like, and the user behavior information database stores users' network behaviors on the server, such as recording live videos and publishing or sharing song climax videos. Of course, the server may also include other functional servers in order to provide more comprehensive and diversified services.
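As a rough illustration of the song library lookup, the sketch below stores a feature set and the climax time periods for each song and picks the entry with the highest matching degree. The application does not specify how the matching degree is computed; the Jaccard similarity over integer "fingerprints" used here, along with the names `SongEntry`, `matching_degree` and `find_target_song`, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SongEntry:
    """One song library record (hypothetical layout)."""
    song_id: str
    fingerprint: frozenset                      # stand-in for real audio features
    climax_periods: list = field(default_factory=list)  # [(start_s, end_s), ...]


def matching_degree(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity, used only as a placeholder matching score."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_target_song(library, live_fingerprint):
    """Return the entry whose fingerprint best matches the live audio."""
    return max(library, key=lambda e: matching_degree(e.fingerprint, live_fingerprint))


library = [
    SongEntry("song-a", frozenset({1, 2, 3, 4}), [(60.0, 90.0)]),
    SongEntry("song-b", frozenset({7, 8, 9}), [(45.0, 70.0), (120.0, 150.0)]),
]
target = find_target_song(library, frozenset({7, 8, 10}))
```

Once the target entry is found, its `climax_periods` field plays the role of the pre-stored correspondence between song audio data and climax time periods.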

A terminal may refer to one of a plurality of terminals, and this embodiment is only illustrated by a terminal. Those skilled in the art will appreciate that the number of terminals described above may be greater or fewer. For example, the number of the terminals may be only a few, or the number of the terminals may be several tens or hundreds, or more, and the number of the terminals and the type of the device are not limited in the embodiment of the present application.

Fig. 2 is a flowchart of a terminal side in a method for capturing a video segment of a song climax according to an embodiment of the present application. Referring to fig. 2, the embodiment includes:

201. in the video live broadcasting process, when a singing instruction corresponding to a target song is received, playing accompaniment audio data of the target song, and sending a singing starting notice to a server.

202. And when the playing of the accompaniment audio data is finished, sending a singing ending notice to the server.

203. And receiving the address information of the song climax video segment sent by the server.

204. And downloading the song climax video segment based on the address information.
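Steps 201 to 204 can be sketched as a small event-driven class. This is a minimal sketch, not the application's implementation: the class name `ClimaxClipTerminal`, the message dictionaries, and the injected `send_notification` and `download` callables are all hypothetical, and real accompaniment playback and networking are left out.

```python
class ClimaxClipTerminal:
    """Terminal-side steps 201-204 with the network calls injected."""

    def __init__(self, send_notification, download):
        self._send = send_notification   # hypothetical: callable(dict) -> None
        self._download = download        # hypothetical: callable(address) -> bytes
        self.song_id = None

    def on_singing_instruction(self, song_id, now):
        # Step 201: accompaniment playback would start here; then notify the server.
        self.song_id = song_id
        self._send({"type": "singing_start", "song_id": song_id, "time": now})

    def on_accompaniment_finished(self, now):
        # Step 202: playback ended, so send the singing end notification.
        self._send({"type": "singing_end", "song_id": self.song_id, "time": now})

    def on_address_received(self, address):
        # Steps 203-204: the server sent address information; download the segment.
        return self._download(address)


sent = []
terminal = ClimaxClipTerminal(sent.append, lambda addr: f"clip from {addr}".encode())
terminal.on_singing_instruction("song-a", now=126.0)
terminal.on_accompaniment_finished(now=131.0)
clip = terminal.on_address_received("clips/room-1/song-a")
```

Injecting the two callables keeps the sketch runnable without a server and makes the order of notifications easy to check.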

In one possible implementation, after sending the singing end notification to the server, the method further includes:

the downloading the video segment of the song climax based on the address information comprises:

and when a confirmation instruction triggered through the download prompt window is received, downloading the song climax video segment based on the address information.

and when a downloading instruction corresponding to the second thumbnail video is received, downloading the song climax video segment corresponding to the second thumbnail video based on the address information of the second thumbnail video.

All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.

Fig. 3 is a flowchart of a server side in a method for capturing a video segment of a song climax according to an embodiment of the present application. Referring to fig. 3, the embodiment includes:

301. receiving a singing starting notice and a singing ending notice which are sent by a terminal in the live video broadcasting process;

302. acquiring a live broadcast video uploaded by the terminal in a time period of receiving the singing start notification and the singing end notification, and acquiring audio data in the live broadcast video;

303. searching target song audio data with the highest matching degree with the audio data in song audio data stored in a song library;

304. determining a target climax time period corresponding to the target song audio data according to a pre-stored corresponding relation between the song audio data and the climax time period;

305. and intercepting a song climax video segment in the live video based on the target climax time segment.
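Step 305 needs the target climax time periods expressed as offsets into the uploaded live video. The sketch below shows one way this could work, under an assumption the text does not state: the uploaded video begins at the moment the accompaniment starts, so offsets into the song equal offsets into the video. The function name `clip_intervals` is hypothetical.

```python
def clip_intervals(climax_periods, singing_duration):
    """Map climax periods (seconds from song start) onto the uploaded video.

    Assumes the video starts when the accompaniment starts, so song offsets
    equal video offsets. Periods are clamped to the recorded duration, and
    periods that fall entirely outside it are dropped.
    """
    intervals = []
    for start, end in climax_periods:
        clipped_start = max(0.0, start)
        clipped_end = min(end, singing_duration)
        if clipped_start < clipped_end:
            intervals.append((clipped_start, clipped_end))
    return intervals


# The anchor stopped singing after 215 seconds, so the second climax is
# truncated and the third is dropped.
intervals = clip_intervals(
    [(60.0, 90.0), (200.0, 230.0), (290.0, 320.0)],
    singing_duration=215.0,
)
```

Clamping matters because playback may end early, in which case a climax period stored for the full song may be only partly present in the video, or absent.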

generating a thumbnail video of the song climax video segment;

and transmitting the thumbnail video of the song climax video segment to the terminal.

generating a thumbnail video of each song climax video segment;

and transmitting the thumbnail video of each song climax video segment to the terminal.

Fig. 4 is a flowchart of interaction between a terminal and a server in a method for capturing video segments of a song climax according to an embodiment of the present application. Referring to fig. 4, the embodiment includes:

401. in the video live broadcast process, when a terminal receives a singing instruction corresponding to a target song, accompaniment audio data of the target song is played.

In the embodiment of the application, a live application program supporting live broadcast service can be installed and operated on the terminal so as to realize a video live broadcast function.

The target song is a song selected by the anchor user to sing in the live broadcast application program, and can be provided by the application program. The accompaniment audio data refers to digitized sound data of different versions of the accompaniment of the target song, and can also be provided by the application program.

In implementation, the terminal can provide an interface for selecting a target song and an interface for selecting accompaniment audio data of the target song. The terminal can also download the accompaniment audio data from the server, and the server can provide multiple versions of the accompaniment audio data. Fig. 5 is a schematic diagram of an interface for selecting a target song. As shown in Fig. 5, a plurality of target songs may be provided, along with a selection key corresponding to each target song. When a trigger operation by the anchor user on the selection key of any target song is detected, the current interface jumps to the accompaniment audio data interface corresponding to that target song. Fig. 6 is a schematic diagram of an interface for selecting the accompaniment audio data of the target song. As shown in Fig. 6, multiple versions of accompaniment audio data for the target song may be provided, each with a singing key. When a trigger operation by the anchor user on the singing key of any accompaniment audio data is detected, the terminal sends a singing request carrying the target song to the server. After the server receives the request, the corresponding accompaniment audio data is downloaded from the server, and the server sends a singing instruction corresponding to the target song to the terminal. When the terminal receives the singing instruction corresponding to the target song, it plays the accompaniment audio data.

For example, when a trigger operation by the anchor user on the selection key for "say nothing cry" is detected, the current interface jumps from Fig. 5 to Fig. 6, where the anchor user can select the original accompaniment of "say nothing cry". After the singing key is clicked, the terminal sends a singing request carrying the original accompaniment of "say nothing cry" to the server. After the server receives the request, the original accompaniment audio data of "say nothing cry" is downloaded from the server, and the server sends a singing instruction for "say nothing cry" to the terminal. When the terminal receives the singing instruction, it plays the original accompaniment audio data of "say nothing cry".

402. The terminal transmits a singing start notification to the server.

Wherein the singing start notification is for the terminal to notify the server that a singing instruction of the target song is started to be executed. The singing start notification carries a target song identifier, a live broadcast room identifier, and a singing start time identifier of the target song. The target song identifier is a unique identifier for identifying the target song, and the live room identifier is a unique identifier for identifying the target live room. The singing start time identifier of the target song refers to a time identifier at which the accompaniment audio data of the target song starts to be played.

The target song identifier, the live broadcast room identifier, and the singing start time identifier of the target song may be composed of at least one of numbers and letters, which is not limited in the embodiment of the present application.
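The three fields carried by the singing start notification can be pictured as a small record. The application does not define a wire format, so the JSON encoding, the class name `SingingStartNotification`, the field names and the sample values below are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SingingStartNotification:
    song_id: str      # target song identifier
    room_id: str      # live broadcast room identifier
    start_time: str   # singing start time identifier of the target song

    def to_json(self) -> str:
        """Serialize the notification (JSON is an assumed encoding)."""
        return json.dumps(asdict(self), sort_keys=True)


notice = SingingStartNotification(song_id="S001", room_id="R1", start_time="2:06")
encoded = notice.to_json()
```

The identifiers are kept as plain strings, consistent with the statement that they may be composed of numbers, letters, or both.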

In implementation, when a trigger operation by the anchor user on the singing key of any accompaniment audio data is detected, the terminal sends a singing start notification carrying the target song identifier, the live broadcast room identifier and the singing start time identifier of the target song to the server through the network interface. That is, the anchor user's operation of selecting the accompaniment audio data and the terminal's sending of the singing start notification to the server occur at the same time.

For example, the anchor user in live broadcast room No. 1 starts playing the original accompaniment of "say nothing cry" at 2:06, and at the same time the terminal sends, through the network interface, a singing start notification to the server carrying the "say nothing cry" identifier, the identifier of live broadcast room No. 1, and the singing start time identifier 2:06.

In implementation, the terminal may send the singing start notification carrying the target song identifier, the live broadcast room identifier, and the singing start time identifier of the target song to the server through the Hypertext Transfer Protocol (HTTP). HTTP is a simple request-response protocol that runs over TCP (Transmission Control Protocol); it specifies what messages a client in a terminal may send to a server and what responses the client may receive. Here, the client may be the live application supporting the live broadcast service.
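A minimal sketch of sending the notification over HTTP, using Python's standard `urllib.request`. The endpoint path `/singing/start`, the host `live.example.invalid`, the JSON body and the function name `build_start_request` are hypothetical; the application only states that HTTP may be used. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_start_request(base_url, payload):
    """Build (but do not send) an HTTP POST carrying the notification."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/singing/start",          # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request(
    "http://live.example.invalid",                # placeholder host
    {"song_id": "S001", "room_id": "R1", "start_time": "2:06"},
)
# Passing `request` to urllib.request.urlopen would perform the exchange:
# the client sends the request and the server answers with a response.
```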

403. And the server receives a singing starting notice sent by the terminal in the live video broadcasting process.

In the video live broadcast process, the anchor user can play the accompaniment audio data of the target song, can sing the target song and can also perform some actions in cooperation with the accompaniment audio data of the target song. The embodiment of the present application does not limit this.

In implementation, during the live video broadcast, the server may receive the singing start notification sent by the terminal, which carries the target song identifier, the live broadcast room identifier, and the singing start time identifier of the target song.

For example, the anchor user in live broadcast room No. 1 starts playing the original accompaniment of "say nothing cry" at 2:06, and at the same time the terminal sends, through the network interface, a singing start notification to the server carrying the "say nothing cry" identifier, the identifier of live broadcast room No. 1, and the singing start time identifier 2:06. During the live video broadcast, the server can receive this singing start notification sent by the terminal.

404. When the playing of the accompaniment audio data is finished, the terminal sends a singing end notification to the server.

The singing end notification is used by the terminal to inform the server that the playing of the accompaniment audio data has finished. It carries the target song identifier, the live broadcast room identifier, and the singing end time identifier of the target song. Playback may finish either because the end of the accompaniment audio data is reached or because the anchor user actively triggers a key operation that ends playback; the embodiment of the present application does not limit this.

In implementation, when the playing of the accompaniment audio data is finished, the terminal sends the server, through the network interface, a singing end notification carrying the target song identifier, the live broadcast room identifier, and the singing end time identifier of the target song.

For example, when the anchor user in live broadcast room No. 1 finishes playing the original accompaniment of "Say Good Not Cry" at 2:11, the terminal sends the server, through the network interface, a singing end notification carrying the identifier of "Say Good Not Cry", the identifier of live broadcast room No. 1, and a singing end time identifier of 2:11.

405. The server receives the singing end notification sent by the terminal during the live video broadcast.

In implementation, the server may receive, during the live video broadcast, the singing end notification sent by the terminal, which carries the target song identifier, the live broadcast room identifier, and the singing end time identifier of the target song.

For example, when the anchor user in live broadcast room No. 1 finishes playing the original accompaniment of "Say Good Not Cry" at 2:11, the terminal sends the server, through the network interface, a singing end notification carrying the identifier of "Say Good Not Cry", the identifier of live broadcast room No. 1, and a singing end time identifier of 2:11. The server then receives this singing end notification during the live video broadcast.

Note that, in the above steps 402 to 405, the terminal transmits the singing start notification and the singing end notification to the server as two independent behaviors of the terminal, and the server receives the singing start notification and the singing end notification transmitted by the terminal as two independent behaviors of the server.

406. The server acquires the live video uploaded by the terminal in the time period between receiving the singing start notification and receiving the singing end notification, and acquires the audio data in the live video.

In implementation, the server may store the received target song identifier, live broadcast room identifier, and singing start time identifier of the target song as a three-column table, and likewise store the received target song identifier, live broadcast room identifier, and singing end time identifier of the target song as a three-column table, so that the server can query the singing start time and the singing end time corresponding to the target song sung in a given live broadcast room.

For example, Table 1 is a table of the singing start notification information received by the server. The server may receive the following records:

TABLE 1

Live broadcast room	Target song	Singing start time
1	Say Good Not Cry	1:20
2	Picture	2:00
3	Get on the wind	10:15

If the server needs to query the start time corresponding to "Say Good Not Cry" played by the anchor user in live broadcast room No. 1, it can determine from the first row of Table 1 stored by the server that the singing start time is 1:20.

For example, Table 2 is a table of the singing end notification information received by the server. The server may receive the following records:

TABLE 2

Live broadcast room	Target song	Singing end time
1	Say Good Not Cry	1:27
2	Picture	2:05
3	Get on the wind	10:21

If the server needs to query the end time corresponding to "Say Good Not Cry" played by the anchor user in live broadcast room No. 1, it can determine from the first row of Table 2 stored by the server that the singing end time is 1:27.
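The two tables and the queries against them can be sketched as two in-memory mappings keyed by live broadcast room and target song. This is an illustrative sketch under assumptions: the embodiment does not say how the tables are stored, and the names `start_times`, `end_times`, and `singing_period` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (live broadcast room, target song); value: time identifier as received.
Key = Tuple[str, str]

start_times: Dict[Key, str] = {
    ("1", "Say Good Not Cry"): "1:20",
    ("2", "Picture"): "2:00",
    ("3", "Get on the wind"): "10:15",
}

end_times: Dict[Key, str] = {
    ("1", "Say Good Not Cry"): "1:27",
    ("2", "Picture"): "2:05",
    ("3", "Get on the wind"): "10:21",
}


def singing_period(room: str, song: str) -> Optional[Tuple[str, str]]:
    """Return (start, end) for a room and song, or None if either is missing."""
    key = (room, song)
    if key not in start_times or key not in end_times:
        return None  # the end notification may not have arrived yet
    return start_times[key], end_times[key]


period = singing_period("1", "Say Good Not Cry")
```

Keying on both room and song keeps the lookup unambiguous when several rooms sing the same song at the same time.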

In implementation, the target song sung by each anchor user, the singing start time, and the singing end time may be the same or different, and this is not limited in this application.

For example, during the live video broadcast, the anchor user in live broadcast room No. 1 selects "Say Good Not Cry" and begins singing at 3:00, and ends singing at 3:05. The anchor user in live broadcast room No. 2 may also select "Say Good Not Cry" and begin singing at 3:00, and end singing at 3:05. The anchor user in live broadcast room No. 3 also selects "Say Good Not Cry", but begins singing at 4:15 and ends singing at 4:20.

In implementation, the server can determine, according to the live broadcast room identifier sent by the terminal during the live video broadcast, which anchor user's live broadcast room the uploaded live video comes from, and then determine the target song, the singing start time, and the singing end time corresponding to that live broadcast room. The server intercepts, from the live video of that live broadcast room, the live video segment from the singing start time to the singing end time, and then acquires the audio data in that live video segment.

The live video segment refers to a video segment generated by capturing a specified time range from a live streaming data archive file.

For example, the server can intercept the live video of "Say Good Not Cry" in live broadcast room No. 1 from the singing start time of 1:20 to the singing end time of 1:27, and thereby obtain the live video corresponding to the complete song "Say Good Not Cry".
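The interception window in this example can be worked out with simple arithmetic. The sketch below assumes, purely for illustration, that the time identifiers are "minutes:seconds" positions in the archived live stream; the embodiment does not define their format, and both function names are hypothetical.

```python
from typing import Tuple


def clock_to_seconds(mark: str) -> int:
    """Convert a "minutes:seconds" identifier such as "1:20" to seconds."""
    minutes, seconds = mark.split(":")
    return int(minutes) * 60 + int(seconds)


def segment_bounds(start_mark: str, end_mark: str) -> Tuple[int, int]:
    """Return (offset, duration) in seconds of the live video segment."""
    start = clock_to_seconds(start_mark)
    end = clock_to_seconds(end_mark)
    if end <= start:
        raise ValueError("singing end time must be later than start time")
    return start, end - start


# Room No. 1 in Tables 1 and 2: start at 1:20, end at 1:27.
offset, duration = segment_bounds("1:20", "1:27")
```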

407. The server searches the song audio data stored in the song library for the target song audio data that has the highest matching degree with the audio data.

The song audio data may include original song audio data of a target song, and may also include accompaniment audio data of different versions of the target song, where the original song audio data and the accompaniment audio data both carry a target song identifier.

In implementation, after acquiring the audio data in the live video, the server searches the song audio data stored in the song library for the target song audio data that has the highest matching degree with that audio data. The server can look up the song audio data corresponding to the target song identifier according to the correspondence between target song identifiers and song audio data, and can judge whether the song audio data is original song audio data or accompaniment audio data according to the song information carried in the song audio data. The song information includes field information used to distinguish the original song audio data of a song from its accompaniment audio data.

It should be noted that, when searching for the target song audio data with the highest matching degree with the audio data, the highest matching degree may mean a complete match, or a matching degree that reaches a certain proportion, for example 90% or more. The embodiment of the present application does not limit this.
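One way to picture the "highest matching degree" search with a minimum proportion is shown below. This is a sketch under stated assumptions: the embodiment does not say how the matching degree is computed, so a set-overlap (Jaccard) score over hypothetical audio fingerprint hashes is swapped in for illustration, and the 0.9 threshold simply mirrors the 90% example.

```python
from typing import Dict, Optional, Set, Tuple


def matching_degree(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two fingerprint sets (illustrative measure only)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    query: Set[int],
    library: Dict[str, Set[int]],
    threshold: float = 0.9,
) -> Optional[Tuple[str, float]]:
    """Return (song audio id, score) of the best candidate at or above threshold."""
    best_id: Optional[str] = None
    best_score = 0.0
    for audio_id, fingerprint in library.items():
        score = matching_degree(query, fingerprint)
        if score > best_score:
            best_id, best_score = audio_id, score
    if best_id is None or best_score < threshold:
        return None  # nothing in the library matches well enough
    return best_id, best_score


# Hypothetical fingerprints: integers stand in for hashed audio features.
song_library = {
    "song-a-original": set(range(100)),
    "song-b-original": set(range(50, 150)),
}
result = best_match(set(range(5, 100)), song_library)
```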

408. The server determines the target climax time period corresponding to the target song audio data according to the pre-stored correspondence between song audio data and climax time periods.

The climax time period refers to the time period corresponding to the refrain (chorus) part of the target song. The climax video segment refers to a video segment generated by intercepting, from the live streaming data archive file, the time period of the refrain part of the target song.

In implementation, whether the song audio data identified by the server is original song audio data or accompaniment audio data, the corresponding climax time period can ultimately be determined. If the song audio data is original song audio data, the target climax time period is determined directly according to the pre-stored correspondence between original song audio data and climax time periods. Because the target climax time period is generally marked in the original song audio data, if the song audio data is accompaniment audio data, the server first finds the corresponding original song audio data according to the correspondence, pre-stored in the song library, between original song audio data and accompaniment audio data, and then determines the target climax time period according to the pre-stored correspondence between original song audio data and climax time periods.

For example, the climax time period corresponding to the original song audio data of "Say Good Not Cry" may be from 1 minute 15 seconds to 2 minutes 22 seconds, timed from the moment the server receives the singing start notification. If the server detects that the song audio data is the original song audio data, it can directly determine that the corresponding climax time period is from 1 minute 15 seconds to 2 minutes 22 seconds. If the server instead detects the audio data of a lowered-key version of "Say Good Not Cry", it finds the original song audio data of "Say Good Not Cry" according to the correspondence, pre-stored in the song library, between the original song audio data and the lowered-key version, and then determines from the pre-stored climax time period of the original song audio data that the climax time period of "Say Good Not Cry" is from 1 minute 15 seconds to 2 minutes 22 seconds.
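The two-step lookup in this example (accompaniment version to original, then original to climax time period) can be sketched with two mappings. The identifiers and mapping names below are hypothetical; only the 1 minute 15 seconds to 2 minutes 22 seconds period comes from the example.

```python
from typing import Dict, List, Tuple

# (start, end) in seconds, timed from the start of singing.
Period = Tuple[int, int]

# Climax periods are marked on the original song audio data only.
climax_by_original: Dict[str, List[Period]] = {
    "sgnc-original": [(75, 142)],  # 1 min 15 s to 2 min 22 s
}

# Accompaniment versions point back to their original song audio data.
original_of_accompaniment: Dict[str, str] = {
    "sgnc-lowered-key": "sgnc-original",
}


def target_climax_periods(audio_id: str) -> List[Period]:
    """Resolve an accompaniment id to its original, then look up the periods."""
    original_id = original_of_accompaniment.get(audio_id, audio_id)
    return climax_by_original.get(original_id, [])


direct = target_climax_periods("sgnc-original")
via_accompaniment = target_climax_periods("sgnc-lowered-key")
```

Returning a list leaves room for songs that have more than one target climax time period.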

It should be noted that there may be one or more target climax time periods corresponding to the target song audio data, which is not limited in this embodiment of the present application.

409. The server intercepts the song climax video segment from the live video based on the target climax time period.

In implementation, the server intercepts the song climax video segment from the live video according to the obtained target climax time period corresponding to the target song audio data.

For example, having determined that the climax time period corresponding to "Say Good Not Cry" is from 1 minute 15 seconds to 2 minutes 22 seconds, the server can intercept the video segment from 1 minute 15 seconds to 2 minutes 22 seconds from the live video corresponding to "Say Good Not Cry".
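The interception step can be illustrated by building a cut command for a video tool. The embodiment does not name any tool; the sketch below assumes the ffmpeg command-line program as a stand-in and only constructs the argument list without running it. The file names are hypothetical.

```python
from typing import List


def build_cut_command(
    source: str, output: str, start_s: int, end_s: int
) -> List[str]:
    """Build an ffmpeg argument list that copies [start_s, end_s) of source."""
    if end_s <= start_s:
        raise ValueError("climax end must be later than climax start")
    return [
        "ffmpeg",
        "-ss", str(start_s),        # seek to the climax start (input option)
        "-t", str(end_s - start_s),  # keep only the climax duration
        "-i", source,
        "-c", "copy",               # stream copy, no re-encoding
        output,
    ]


# Climax period from the example: 1 min 15 s (75 s) to 2 min 22 s (142 s).
command = build_cut_command("room1_live.flv", "room1_climax.flv", 75, 142)
```

With stream copy the cut typically lands on the nearest keyframes, so a deployment that needs exact boundaries would re-encode instead.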

410. The server stores the song climax video segment and acquires the address information of the song climax video segment.

The address information refers to the network address used for playing the song climax video segment.

In implementation, after intercepting the song climax video segment, the server stores it; unique address information is generated for the stored segment, and the server can then acquire that address information.

For example, after the server intercepts the song climax video segment, it may store the segment in cloud storage, where it is not easily lost and information security is high. Cloud storage can keep multiple copies of the segment, so the loss of one copy does not affect the anchor user's experience. The server can also acquire the address information generated for the song climax video segment. Cloud storage is a mode of online storage on the internet in which data is stored on one or more virtual servers hosted by a third party.
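Storing a segment and obtaining a unique address for it can be sketched as follows. This is illustrative only: an in-memory dictionary stands in for cloud storage, and the base URL, the ".mp4" suffix, and the function names are assumptions rather than anything the embodiment specifies.

```python
import uuid
from typing import Dict

BASE_URL = "https://media.example.com/climax/"  # hypothetical address prefix
SUFFIX = ".mp4"                                   # hypothetical container type

_segment_store: Dict[str, bytes] = {}  # stand-in for cloud storage


def store_segment(data: bytes) -> str:
    """Store a climax video segment and return its unique address."""
    key = uuid.uuid4().hex  # random key, so each stored segment gets its own URL
    _segment_store[key] = data
    return BASE_URL + key + SUFFIX


def load_segment(address: str) -> bytes:
    """Resolve an address produced by store_segment back to the stored bytes."""
    key = address[len(BASE_URL):-len(SUFFIX)]
    return _segment_store[key]


first_address = store_segment(b"climax-segment-1")
second_address = store_segment(b"climax-segment-2")
```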

411. The server sends the address information of the song climax video segment to the terminal.

In implementation, unique address information is generated for the song climax video segment, and the server sends the address information to the terminal in the form of an address link.

412. The terminal receives the address information of the song climax video segment sent by the server.

In implementation, after the server sends the address information to the terminal in the form of an address link, the terminal may receive the address information of the song climax video segment. The address information may be displayed on the terminal as a string of network address characters, or as a thumbnail of the song climax video segment. The embodiment of the present application does not limit this.

413. The terminal downloads the song climax video segment based on the address information.

In implementation, when the terminal detects a trigger operation for downloading the song climax video segment according to its address information, the terminal may send the server a download request for the song climax video segment; after the server receives the request, the terminal downloads the corresponding song climax video segment from the server, so that the segment can be played at the terminal.

For example, the anchor user may click the address information of the climax video segment corresponding to "Say Good Not Cry", displayed either as a string of network address characters or as a thumbnail of the climax video segment. This triggers an operation instruction for downloading the song climax video segment: the terminal sends the server a download request for the segment, downloads the corresponding song climax video segment from the server after the server receives the request, and can then play the climax video segment corresponding to "Say Good Not Cry".

In implementation, after downloading the song climax video segment, the anchor user may share the song climax video segment to the circle of friends, or may publish the song climax video segment in the form of short video. The embodiment of the present application does not limit this.

For steps 401 to 413 above, reference may be made to Fig. 7, which is a framework diagram of the method for intercepting a song climax video segment provided by the embodiment of the present application. With this method, the live video does not need to be imported into an additionally installed video editing application program in order to intercept the climax video segment.

In a possible implementation manner, the process of the method for intercepting a song climax video segment provided in the embodiment of the present application may also be as follows: after intercepting the song climax video segment from the live video according to the target climax time period, the server can also generate a thumbnail video of the song climax video segment and send it to the terminal. The terminal receives the thumbnail video of the song climax video segment sent by the server, displays a download prompt window, and plays the thumbnail video in the download prompt window. Then, when the terminal receives a confirmation instruction triggered through the download prompt window, it downloads the song climax video segment based on the address information.

In implementation, after the server intercepts the song climax video segment from the live video, address information in the form of a thumbnail video may be generated for the segment; the server then obtains this address information and sends the thumbnail video of the song climax video segment to the terminal. When the terminal receives the thumbnail video sent by the server, a download prompt window may be displayed on the current terminal interface, as shown in Fig. 8, which is a schematic interface diagram of the download prompt window of the method for intercepting a song climax video segment provided in the embodiment of the present application. The download prompt window may be provided with the thumbnail video of the song climax video segment, a selection key corresponding to the thumbnail video, and the like, which is not limited in the embodiment of the present application. The anchor user can then operate the selection key corresponding to the thumbnail video, and when the terminal receives the selection instruction triggered through the download prompt window, it downloads the song climax video segment according to the corresponding address information. The song climax video segment can then be played at the terminal.

In a possible implementation manner, the process of the method for intercepting a song climax video segment provided in the embodiment of the present application may further be as follows: after intercepting a plurality of song climax video segments from the live video according to the target climax time periods, the server can generate a thumbnail video of each song climax video segment and send each thumbnail video to the terminal. The terminal may then receive the thumbnail videos of the plurality of song climax video segments sent by the server, display a download prompt window, display options corresponding to the plurality of thumbnail videos in the download prompt window, and, on receiving a selection instruction for a first thumbnail video among the plurality of thumbnail videos, play the first thumbnail video. When the terminal receives a download instruction corresponding to a second thumbnail video, it downloads the song climax video segment corresponding to the second thumbnail video according to the address information of the second thumbnail video.

The target song audio data corresponds to a plurality of target climax time periods, the live broadcast video can comprise a plurality of song climax video segments, and the address information of the song climax video segments can comprise the address information of the plurality of song climax video segments. The server can determine a plurality of target climax time periods corresponding to the target song audio data according to the inter-sentence time points in the target song audio data, so that the server can acquire a plurality of song climax video segments.
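One possible reading of deriving several target climax time periods from inter-sentence time points is to split a climax time period at the sentence boundaries that fall inside it. The sketch below illustrates that reading only; the embodiment does not spell out the procedure, and the function name and the sample boundary values are hypothetical.

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start, end) in seconds


def split_at_sentence_points(period: Period, points: List[int]) -> List[Period]:
    """Split one climax period at the inter-sentence points that lie inside it."""
    start, end = period
    inside = sorted(p for p in points if start < p < end)
    cuts = [start] + inside + [end]
    # Pair consecutive cut points into contiguous sub-periods.
    return list(zip(cuts[:-1], cuts[1:]))


# Hypothetical sentence boundaries; 60 and 150 fall outside the period.
sub_periods = split_at_sentence_points((75, 142), [60, 98, 120, 150])
```

Each resulting sub-period would then be intercepted as its own song climax video segment with its own address information.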

In implementation, after the server intercepts the song climax video segments from the live video, a thumbnail video may be generated for each song climax video segment, and the server may then send the address information of each song climax video segment, in the form of its thumbnail video, to the terminal. When the terminal receives the thumbnail videos of the plurality of song climax video segments sent by the server, a download prompt window may be displayed on the current terminal interface, as shown in Fig. 9, which is a schematic interface diagram of the download prompt window of the method for intercepting a song climax video segment provided in the embodiment of the present application. The download prompt window can be provided with options corresponding to the plurality of thumbnail videos, a selection key corresponding to each thumbnail video, and the like, which gives the anchor user more choices and improves the anchor user's experience. The anchor user can then operate the selection key corresponding to any thumbnail video.

For example, when the terminal receives a selection instruction, triggered through the download prompt window, for a first thumbnail video among the plurality of thumbnail videos, it downloads the song climax video segment corresponding to the first thumbnail video according to the address information of that segment, and plays that segment at the terminal. When the terminal receives a selection instruction, triggered through the download prompt window, for a second thumbnail video among the plurality of thumbnail videos, it downloads the song climax video segment corresponding to the second thumbnail video according to the address information of that segment, and plays that segment at the terminal.

The anchor user can trigger one selection instruction in the plurality of thumbnail videos and can also trigger a plurality of selection instructions in the plurality of thumbnail videos. The embodiment of the present application does not limit this.
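The mapping from selected thumbnail options to the segments to download can be sketched as below. The class `ClimaxOption`, the function name, and the example addresses are hypothetical and serve only to show that one or several selections resolve to the address information of the corresponding segments.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ClimaxOption:
    """One entry in the download prompt window (illustrative shape)."""
    thumbnail_url: str  # address of the thumbnail video shown as the option
    segment_url: str    # address information of the full climax video segment


def addresses_to_download(
    options: Sequence[ClimaxOption], selected: Sequence[int]
) -> List[str]:
    """Return segment addresses for the selected options, without duplicates."""
    seen = set()
    result: List[str] = []
    for index in selected:
        if not 0 <= index < len(options):
            raise IndexError(f"no thumbnail option at position {index}")
        if index not in seen:
            seen.add(index)
            result.append(options[index].segment_url)
    return result


window_options = [
    ClimaxOption("https://media.example.com/t/1", "https://media.example.com/s/1"),
    ClimaxOption("https://media.example.com/t/2", "https://media.example.com/s/2"),
]
chosen = addresses_to_download(window_options, [1, 0, 1])
```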

With the method for intercepting a song climax video segment provided by the embodiment of the present application, there is no need to start a video editing application program, import the live video of the singing process into it, and intercept the climax video segment there. The method has simple steps and improves the efficiency of intercepting climax video segments.

An embodiment of the present application provides a device for intercepting a song climax video segment, where the device may be the terminal in the foregoing embodiments. As shown in Fig. 10, which is a schematic structural diagram of the device for intercepting a song climax video segment, the device includes:

a first sending module 1001, configured to play accompaniment audio data of a target song when a singing instruction corresponding to the target song is received in a live video broadcast process, and send a singing start notification to a server;

a second sending module 1002, configured to send a singing ending notification to the server when the playing of the accompaniment audio data is finished;

a first receiving module 1003, configured to receive address information of a song climax video segment sent by the server;

a downloading module 1004 for downloading the song climax video segment based on the address information.

In one possible implementation, after the sending of the singing end notification to the server, the apparatus further includes:

the downloading module 1004 is further configured to:

the display module is used for displaying a download prompting window, and displaying options corresponding to a plurality of thumbnail videos in the download prompting window;

the downloading module 1004 is further configured to:

An embodiment of the present application provides a device for intercepting a song climax video segment, where the device may be the server in the foregoing embodiments. As shown in Fig. 11, which is a schematic structural diagram of the device for intercepting a song climax video segment, the device includes:

a receiving module 1101, configured to receive a singing start notification and a singing end notification sent by a terminal in a live video broadcast process;

an obtaining module 1102, configured to obtain a live video uploaded by the terminal in the time period between receiving the singing start notification and receiving the singing end notification, and obtain audio data in the live video;

the searching module 1103 is configured to search, in the song audio data stored in the song library, target song audio data with the highest matching degree with the audio data;

a determining module 1104, configured to determine a target climax time period corresponding to the target song audio data according to a correspondence between pre-stored song audio data and the climax time period;

an intercepting module 1105, configured to intercept a song climax video segment in the live video based on the target climax time segment.

In one possible implementation, after intercepting a song climax video segment in the live video based on the target climax time segment, the apparatus further includes:

and the third sending module is used for sending the thumbnail video of the climax video segment of each song to the terminal.

It should be noted that the device for intercepting a song climax video segment provided by the above embodiments is described using the above division of functional modules only as an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the device for intercepting a song climax video segment and the method for intercepting a song climax video segment provided by the embodiments belong to the same concept; the specific implementation processes are detailed in the method embodiments and are not repeated here.

The embodiment of the application provides a system for intercepting a song climax video segment, which comprises a terminal and a server, wherein:

the terminal is used for playing accompaniment audio data of a target song and sending a singing start notification to the server when a singing instruction corresponding to the target song is received during the live video broadcast; sending a singing end notification to the server when the playing of the accompaniment audio data is finished; receiving address information of the song climax video segment sent by the server; and downloading the song climax video segment based on the address information;

the server is used for receiving the singing start notification and the singing end notification sent by the terminal during the live video broadcast; acquiring the live video uploaded by the terminal in the time period between receiving the singing start notification and receiving the singing end notification, and acquiring audio data in the live video; searching the song audio data stored in a song library for target song audio data with the highest matching degree with the audio data; determining a target climax time period corresponding to the target song audio data according to a pre-stored correspondence between song audio data and climax time periods; and intercepting a song climax video segment from the live video based on the target climax time period.
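The order in which the server-side operations depend on one another can be sketched as a small pipeline whose stages are passed in as functions. This is an architectural illustration only: the function and parameter names are hypothetical, and each stage is a placeholder for whatever the server actually uses.

```python
from typing import Callable, List, Tuple

Period = Tuple[int, int]  # (start, end) in seconds


def run_server_pipeline(
    live_video: str,
    extract_audio: Callable[[str], str],
    find_song: Callable[[str], str],
    climax_periods: Callable[[str], List[Period]],
    cut: Callable[[str, Period], str],
) -> List[str]:
    """Chain the server-side stages and return the intercepted segments."""
    audio = extract_audio(live_video)   # audio data in the live video
    song_id = find_song(audio)          # best-matching target song audio data
    periods = climax_periods(song_id)   # target climax time period(s)
    return [cut(live_video, period) for period in periods]
```

For instance, calling it with trivial stand-in stages yields one segment label per climax time period:

```python
segments = run_server_pipeline(
    "room1_live",
    extract_audio=lambda video: video + ".audio",
    find_song=lambda audio: "sgnc-original",
    climax_periods=lambda song_id: [(75, 142)],
    cut=lambda video, period: f"{video}[{period[0]}-{period[1]}]",
)
```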

Fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 1200 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 1200 may also be referred to by other names, such as user equipment, portable terminal, laptop terminal, or desktop terminal.

In general, terminal 1200 includes: a processor 1201 and a memory 1202.

The processor 1201 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1201 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1201 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.

Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. Memory 1202 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1202 is used to store at least one instruction for execution by processor 1201 to implement the song climax video segment capture method provided by the method embodiments of the present application.

In some embodiments, the terminal 1200 may further optionally include a peripheral interface 1203 and at least one peripheral device. The processor 1201, the memory 1202, and the peripheral interface 1203 may be connected by a bus or signal line. Each peripheral device may be connected to the peripheral interface 1203 via a bus, signal line, or circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1204, a touch display screen 1205, a camera assembly 1206, an audio circuit 1207, a positioning component 1208, and a power supply 1209.

The peripheral interface 1203 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, memory 1202, and peripheral interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The Radio Frequency circuit 1204 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with a communication network and other communication devices by electromagnetic signals. The radio frequency circuit 1204 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1204 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1204 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 1204 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.

The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to acquire touch signals on or over its surface. The touch signal may be input to the processor 1201 as a control signal for processing. In this case, the display screen 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1205, disposed on the front panel of the terminal 1200; in other embodiments, there may be at least two display screens 1205, respectively disposed on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display screen 1205 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 1200. The display screen 1205 may even be arranged in a non-rectangular irregular shape, i.e., a shaped screen. The display screen 1205 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.

The camera assembly 1206 is used to capture images or video. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 1206 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.

The audio circuit 1207 may include a microphone and a speaker. The microphone is used for collecting sound waves of the user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1201 for processing or to the radio frequency circuit 1204 to achieve voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided at different locations of the terminal 1200. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electric signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electric signal not only into a sound wave audible to humans, but also into a sound wave inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 1207 may also include a headphone jack.

The positioning component 1208 is configured to locate the current geographic location of the terminal 1200 to implement navigation or LBS (Location Based Service). The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.

The power supply 1209 is used to supply power to the various components in the terminal 1200. The power supply 1209 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 1209 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also support fast-charging technology.

In some embodiments, terminal 1200 also includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to: acceleration sensor 1211, gyro sensor 1212, pressure sensor 1213, fingerprint sensor 1214, optical sensor 1215, and proximity sensor 1216.

The acceleration sensor 1211 can detect magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 1200. For example, the acceleration sensor 1211 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 1201 may control the touch display 1205 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211. The acceleration sensor 1211 may also be used for acquisition of motion data of a game or a user.
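The landscape/portrait decision described above can be illustrated with a minimal sketch. The function name `choose_orientation`, the axis convention (x along the short edge, y along the long edge of the terminal), and the comparison rule are assumptions made for illustration; the application does not specify how the processor 1201 evaluates the gravitational acceleration signal.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Pick a UI orientation from gravity components in m/s^2.

    gx: gravity component along the device's short edge (assumed x axis).
    gy: gravity component along the device's long edge (assumed y axis).
    """
    # When gravity acts mostly along the long edge, the device is held upright.
    if abs(gy) >= abs(gx):
        return "portrait"
    # Otherwise gravity acts mostly along the short edge: the device is on its side.
    return "landscape"


if __name__ == "__main__":
    print(choose_orientation(0.3, 9.7))  # device held upright
    print(choose_orientation(9.6, 0.5))  # device turned on its side
```

A practical implementation would likely add hysteresis so that the view does not flicker when the two components are nearly equal.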

The gyro sensor 1212 may detect a body direction and a rotation angle of the terminal 1200, and the gyro sensor 1212 may collect a 3D motion of the user on the terminal 1200 in cooperation with the acceleration sensor 1211. The processor 1201 can implement the following functions according to the data collected by the gyro sensor 1212: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.

The pressure sensor 1213 may be disposed on a side frame of the terminal 1200 and/or a lower layer of the touch display screen 1205. When the pressure sensor 1213 is disposed on the side frame of the terminal 1200, a holding signal of the user on the terminal 1200 can be detected, and the processor 1201 performs left-hand/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 1213. When the pressure sensor 1213 is disposed at the lower layer of the touch display screen 1205, the processor 1201 controls the operable controls on the UI according to the pressure operation of the user on the touch display screen 1205. The operable controls comprise at least one of a button control, a scroll bar control, an icon control, and a menu control.

The fingerprint sensor 1214 is used for collecting a fingerprint of the user, and the processor 1201 identifies the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 identifies the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 1201 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 1214 may be provided on the front, back, or side of the terminal 1200. When a physical button or vendor Logo is provided on the terminal 1200, the fingerprint sensor 1214 may be integrated with the physical button or vendor Logo.

The optical sensor 1215 is used to collect the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the touch display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1205 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 1205 is decreased. In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
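As an illustration of the brightness adjustment, the following sketch maps ambient light intensity to a display brightness level. The linear mapping, the function name `display_brightness`, and the parameters `min_level`, `max_level`, and `full_bright_lux` are hypothetical; the application only states that brightness rises with high ambient light and falls with low ambient light.

```python
def display_brightness(
    ambient_lux: float,
    min_level: float = 0.1,
    max_level: float = 1.0,
    full_bright_lux: float = 1000.0,
) -> float:
    """Map ambient light intensity (lux) to a brightness level in [min_level, max_level].

    All thresholds are illustrative assumptions, not values from the application.
    """
    # Clamp the reading to the range [0, full_bright_lux].
    clamped = min(max(ambient_lux, 0.0), full_bright_lux)
    # Brighter surroundings give a proportionally brighter screen.
    ratio = clamped / full_bright_lux
    return min_level + (max_level - min_level) * ratio


if __name__ == "__main__":
    for lux in (0.0, 100.0, 500.0, 5000.0):
        print(lux, round(display_brightness(lux), 3))
```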

The proximity sensor 1216, also known as a distance sensor, is typically disposed on the front panel of the terminal 1200. The proximity sensor 1216 is used to collect the distance between the user and the front surface of the terminal 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front surface of the terminal 1200 gradually decreases, the processor 1201 controls the touch display screen 1205 to switch from the bright screen state to the dark screen state; when the proximity sensor 1216 detects that the distance between the user and the front surface of the terminal 1200 gradually increases, the processor 1201 controls the touch display screen 1205 to switch from the dark screen state to the bright screen state.
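The screen switching driven by the proximity sensor can be sketched as a small state update. The function name `next_screen_state`, the state labels, and the use of two consecutive distance readings are assumptions for illustration only.

```python
def next_screen_state(state: str, prev_distance: float, distance: float) -> str:
    """Return the next screen state ("bright" or "dark").

    prev_distance, distance: two consecutive proximity readings (any length unit).
    """
    if distance < prev_distance:
        # The user is approaching the front surface: darken the screen.
        return "dark"
    if distance > prev_distance:
        # The user is moving away from the front surface: light the screen.
        return "bright"
    # No change in distance: keep the current state.
    return state


if __name__ == "__main__":
    state = "bright"
    readings = [10.0, 6.0, 2.0, 2.0, 8.0]
    for prev, cur in zip(readings, readings[1:]):
        state = next_screen_state(state, prev, cur)
        print(prev, cur, state)
```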

Those skilled in the art will appreciate that the configuration shown in fig. 12 does not limit the terminal 1200, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.

Fig. 13 is a schematic structural diagram of a server 1300 according to an embodiment of the present application. The server 1300 may vary considerably depending on its configuration or performance, and may include one or more processors (CPUs) 1301 and one or more memories 1302, where the memory 1302 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 1301 to implement the method for intercepting a song climax video segment provided by each method embodiment. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may also include other components for implementing the functions of the device, which are not described herein again.
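For orientation, the server-side selection logic recited in the method embodiments (pick the stored song with the highest matching degree, look up its prestored climax time period, and derive the interval to intercept) can be sketched as follows. The function name `find_climax_clip`, the dictionary data structures, and the assumption that the acquired live video starts at the same moment as the song, so that song offsets map directly to video offsets, are illustrative and are not taken from the application.

```python
from typing import Dict, Tuple


def find_climax_clip(
    match_scores: Dict[str, float],
    climax_periods: Dict[str, Tuple[float, float]],
    video_duration: float,
) -> Tuple[str, float, float]:
    """Return (target_song_id, clip_start, clip_end) in seconds.

    match_scores: hypothetical matching degree of each stored song against
        the audio data taken from the live video.
    climax_periods: prestored correspondence between song and climax period.
    video_duration: length of the acquired live video, in seconds.
    """
    # Target song: the stored song with the highest matching degree.
    target = max(match_scores, key=match_scores.get)
    start, end = climax_periods[target]
    # Clamp the climax period to the bounds of the acquired live video.
    start = max(0.0, min(start, video_duration))
    end = max(start, min(end, video_duration))
    return target, start, end


if __name__ == "__main__":
    scores = {"song_a": 0.4, "song_b": 0.9}
    periods = {"song_a": (10.0, 20.0), "song_b": (60.0, 95.0)}
    print(find_climax_clip(scores, periods, video_duration=80.0))
```

The returned interval would then be handed to whatever video tool performs the actual interception; that step is outside this sketch.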

In an exemplary embodiment, a computer-readable storage medium, such as a memory, including instructions executable by a processor in a terminal to perform the song climax video segment capturing method in the above embodiments is also provided. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method for intercepting a song climax video segment, the method comprising:

when the playing of the accompaniment audio data is finished, sending a singing ending notice to the server so that the server obtains a live broadcast video from the moment of receiving the singing starting notice to the moment of receiving the singing ending notice, obtaining audio data in the live broadcast video, searching target song audio data with the highest matching degree with the audio data in song audio data stored in a song library, determining a target climax time period corresponding to the target song audio data according to the corresponding relation between the prestored song audio data and the climax time period, and intercepting a song climax video segment in the live broadcast video based on the target climax time period; generating a thumbnail video of the song climax video segment, and sending the thumbnail video of the song climax video segment to a terminal;

2. The method according to claim 1, wherein said downloading the video segment of the song climax based on the address information comprises:

3. The method according to claim 1, wherein the address information of the song climax video segment includes address information of a plurality of song climax video segments, and after the transmission of the singing end notification to the server, the method further comprises:

and when a downloading instruction corresponding to a second thumbnail video is received, downloading a song climax video segment corresponding to the second thumbnail video based on the address information of the second thumbnail video.

4. A method for intercepting a song climax video segment, the method comprising:

intercepting a song climax video segment in the live video based on the target climax time segment;

generating a thumbnail video of the song climax video segment;

5. The method according to claim 4, wherein after intercepting a song climax video segment in the live video based on the target climax time period, the method further comprises:

6. The method according to claim 5, wherein the target song audio data corresponds to a plurality of target climax time periods, the live video comprises a plurality of song climax video segments, and the method further comprises, after intercepting a song climax video segment in the live video based on the target climax time periods:

generating a thumbnail video of the climax video section of each song;

7. An apparatus for intercepting a song climax video segment, the apparatus comprising:

a second sending module, configured to send a singing ending notification to the server when the playing of the accompaniment audio data is completed, so that the server obtains a live video from a time when the singing starting notification is received to a time when the singing ending notification is received, obtains audio data in the live video, searches for target song audio data with a highest matching degree with the audio data in song audio data stored in a song library, determines a target climax time period corresponding to the target song audio data according to a correspondence between prestored song audio data and climax time periods, and intercepts a song climax video segment in the live video based on the target climax time period; generating a thumbnail video of the song climax video segment, and sending the thumbnail video of the song climax video segment to a terminal;

8. The apparatus of claim 7, wherein the download module is further configured to:

9. The apparatus according to claim 7, wherein the address information of the song climax video segment includes address information of a plurality of song climax video segments, and after the transmission of the singing end notification to the server, the apparatus further comprises:

the download module is further configured to:

10. An apparatus for intercepting a song climax video segment, the apparatus comprising:

the intercepting module is used for intercepting a song climax video segment from the live video based on the target climax time segment;

11. The apparatus according to claim 10, wherein after intercepting a song climax video segment in the live video based on the target climax time segment, the apparatus further comprises:

12. The apparatus according to claim 10, wherein the target song audio data corresponds to a plurality of target climax time periods, a plurality of song climax video segments are included in the live video, and the apparatus further comprises, after the song climax video segments are intercepted in the live video based on the target climax time periods:

13. A system for intercepting a song climax video segment, the system comprising a terminal and a server, wherein:

the terminal is used for playing accompaniment audio data of a target song and sending a singing starting notice to the server when a singing instruction corresponding to the target song is received in the live video broadcasting process; when the playing of the accompaniment audio data is finished, sending a singing ending notice to the server, and receiving a thumbnail video of the song climax video segment sent by the server; displaying a download prompting window, and playing the thumbnail video in the download prompting window; receiving address information of the song climax video segment sent by the server; downloading the song climax video segment based on the address information;

the server is used for receiving the singing starting notification and the singing ending notification sent by the terminal in the video live broadcasting process; acquiring live broadcast video uploaded by the terminal in a time period of receiving the singing starting notice and the singing ending notice, and acquiring audio data in the live broadcast video; searching target song audio data with the highest matching degree with the audio data in song audio data stored in a song library; determining a target climax time period corresponding to the target song audio data according to a corresponding relation between pre-stored song audio data and the climax time period; intercepting a song climax video segment in the live video based on the target climax time segment; generating a thumbnail video of the song climax video segment; and sending the thumbnail video of the song climax video segment to the terminal.

14. A computer device, comprising a processor and a memory, wherein at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the operations performed by the method for intercepting a song climax video segment according to any one of claims 1 to 6.

15. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to perform the operations performed by the method for intercepting a song climax video segment according to any one of claims 1 to 6.