CN111447505B - Video clipping method, network device, and computer-readable storage medium - Google Patents


Info

Publication number
CN111447505B
Authority
CN
China
Prior art keywords
time
video
highlight
trigger
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010156612.5A
Other languages
Chinese (zh)
Other versions
CN111447505A (en)
Inventor
钟宜峰
乔美娜
吴耀华
李琳
李鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202010156612.5A
Publication of CN111447505A
Application granted
Publication of CN111447505B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334Recording operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835Generation of protective data, e.g. certificates
    • H04N21/8352Generation of protective data, e.g. certificates involving content or source identification data, e.g. Unique Material Identifier [UMID]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Embodiments of the invention relate to the field of multimedia technology and disclose a video clipping method, a network device, and a computer-readable storage medium. The video clipping method includes the following steps: acquiring the trigger time at which a user performs a special operation on a video to be clipped, where the special operation includes at least one of liking, favoriting, forwarding, sharing, and commenting; determining the start time and end time of a highlight segment in the video to be clipped according to the trigger time; and clipping the audio/video segment corresponding to the highlight segment from the video to be clipped according to the start time and the end time. The video clipping method, network device, and computer-readable storage medium provided by the invention can accurately capture the user's preferences while watching a video and automatically clip the segments the user considers highlights, making it convenient for the user to review the highlights of the video.

Description

Video clipping method, network device, and computer-readable storage medium
Technical Field
The present invention relates to the field of multimedia technologies, and in particular, to a video clipping method, a network device, and a computer-readable storage medium.
Background
With the development of video playing technology, more and more terminal devices, such as mobile phones and computers, integrate a video playing function and, in particular, can provide online video services to users. Currently, a user typically watches a video online on a terminal device as follows: the user retrieves videos from a video website using keywords and then clicks a retrieved video to watch it. In the prior art, a user's preference for a live broadcast is inferred from factors such as the watching duration and the rewards (tips) given while the user watches the live broadcast.
The inventors have found at least the following problems in the prior art: it is difficult to accurately judge a user's preference for a live broadcast from the watching duration and reward behavior alone, and after the live broadcast ends the user cannot quickly find and review the segments they considered highlights.
Disclosure of Invention
An object of embodiments of the present invention is to provide a video clipping method, network device, and computer-readable storage medium that can accurately capture a user's preferences while watching a video and automatically clip the segments the user considers highlights, making it convenient for the user to review the highlights of the video.
In order to solve the above technical problem, an embodiment of the present invention provides a video clipping method, including:
acquiring the trigger time at which a user performs a special operation on a video to be clipped, where the special operation includes at least one of liking, favoriting, forwarding, sharing, and commenting; determining the start time and end time of a highlight segment in the video to be clipped according to the trigger time; and clipping the audio/video segment corresponding to the highlight segment from the video to be clipped according to the start time and the end time.
An embodiment of the present invention further provides a network device, including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video clipping method described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video clipping method described above.
Compared with the prior art, embodiments of the invention can obtain the trigger time at which the user performs a special operation on the video to be clipped (for example, a like button is provided on the playing interface; the user clicks it when watching highlight content of the video, and the moment of the click is the trigger time of the special operation), so the user's preferences for the watched video can be accurately learned from whether the user performs the special operation. By determining the start time and end time of the highlight segment in the video to be clipped according to the trigger time, the segment the user is interested in can be accurately obtained, and the corresponding audio/video segment can be clipped out automatically. This avoids the situation in which highlights cannot be clipped automatically and the user must go through a relatively cumbersome process to review the highlights of a video; instead, the segments the user considers highlights are clipped automatically, making it convenient for the user to review them.
In addition, determining the start time and end time of the highlight segment in the video to be clipped according to the trigger time specifically includes: determining the highlight segment by taking the trigger time as its end time and the moment N seconds before the trigger time as its start time, where N is a natural number greater than 0.
In addition, before taking the moment N seconds before the trigger time as the start time of the highlight segment, the method further includes: judging whether the video to be clipped contains a key time point in the video segment spanning the N seconds before the trigger time, where a key time point is a time point at which new information appears in the video segment; when no key time point is contained, executing the step of taking the moment N seconds before the trigger time as the start time; and when a key time point is contained, taking the earliest key time point in the video segment as the start time. In this way, every picture in the highlight segment can be further ensured to be one the user wants, long unnecessary stretches in the highlight are avoided, and the user's viewing experience when watching the highlight is improved.
In addition, the key time points include one of the following types or any combination thereof: the time point at which a new person appears in the video segment, at which a scene change occurs, at which the shot is switched, at which a new animal or object appears, and the start time point of each line of dialogue.
In addition, after the start time and end time of the highlight segment in the video to be clipped are determined according to the trigger time, the method further includes: acquiring a plurality of identification tags of the highlight segment within the determined time period (the period from the start time to the end time) together with the trigger time of each tag; providing the plurality of tags to the user and acquiring the target tag the user selects from them; and updating the start time of the highlight segment according to the trigger time of the target tag. In this way, the next time the user likes the video, the highlight segment can be determined from the updated start time and the like time, so that the clipped audio/video segment better matches the user's viewing habits, further improving the viewing experience.
In addition, when there are multiple target tags, updating the start time of the highlight segment according to the trigger time point corresponding to the target tag specifically includes: selecting, from the multiple target tags, the one with the earliest trigger time point as the update tag; and updating the start time of the highlight segment according to the trigger time point corresponding to the update tag.
In addition, the start time of the highlight segment is updated according to the following formula: T = T − (T − T′) × k, where T is the difference between the start time of the highlight segment and the trigger time, T′ is the difference between the trigger time point corresponding to the target tag and the trigger time, and k is a constant greater than 0 and less than 1.
In addition, after acquiring the plurality of tags in the highlight segment and the trigger time points corresponding to them, the method further includes: judging whether duplicate tags exist among the plurality of tags; when duplicates exist, keeping the tag with the earliest trigger time point among each set of duplicates and removing the rest. In this way, the user's viewing experience can be further improved.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals refer to similar elements; the figures are not to scale unless otherwise specified.
FIG. 1 is a flow chart of a video clipping method provided in accordance with a first embodiment of the present invention;
FIG. 2 is a flow chart of a video clipping method provided in accordance with a second embodiment of the invention;
FIG. 3 is a flow chart of a video clipping method provided in accordance with a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a network device provided according to a fourth embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in order to provide a better understanding of the invention; however, the claimed technical solution can be implemented without these technical details, and various changes and modifications may be made based on the following embodiments.
A first embodiment of the present invention relates to a video clipping method; the specific flow is shown in fig. 1, and the method includes:
S101: Acquire the trigger time at which the user performs a special operation on the video to be clipped.
In step S101, specifically, the special operation includes at least one of liking, favoriting, forwarding, sharing, and commenting. Taking "liking" as an example, the backend server provides a like button in the video file to be clipped with which the user can like any of its content; the user likes content by triggering this button. For each like, the backend server can obtain a corresponding like record, which stores the time at which the like occurred; that is, the backend server can obtain the trigger time of the like in the video to be clipped by looking up the like record. It should be noted that the backend server may also, instead of generating a like record for each like, directly use the trigger time of the like operation as the start or end time of the highlight segment.
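As a rough illustration of the like-record lookup just described — a minimal sketch, not the patent's actual data layout (the field and function names are assumptions):

```python
from dataclasses import dataclass

@dataclass
class LikeRecord:
    """A minimal sketch of the 'like record' the backend server keeps.
    Field names are assumptions, not the patent's storage format."""
    user_id: str
    video_id: str
    trigger_time_s: float  # playback progress (seconds) when the like button was hit

def trigger_times(records: list[LikeRecord], video_id: str) -> list[float]:
    """Step S101: look up the trigger times of all likes on one video."""
    return [r.trigger_time_s for r in records if r.video_id == video_id]
```

Each returned trigger time can then be fed to step S102 to delimit one highlight segment.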
It should be noted that the video to be clipped in the present embodiment may be a live video or a recorded video (such as a movie or TV series downloaded by the user); this embodiment does not specifically limit the type of the video to be clipped.
S102: Determine the start time and end time of the highlight segment in the video to be clipped according to the trigger time.
In step S102, specifically, the trigger time of the like record corresponds to the playing progress of the video file to be clipped. For example, if user A triggers the like button at the fifth minute while watching the video file, a like record is generated that contains the playing progress of the audio/video file at the moment the button was triggered, i.e. the fifth minute of playback; this playing progress (the 5th minute) is the trigger time of the like record.
It should be noted that, in this embodiment, determining the highlight segment in the video to be clipped according to the trigger time may work as follows: the trigger time is taken as the end time of the highlight segment and the moment N seconds before the trigger time as its start time, where N is a natural number greater than 0. That is, a time interval may be preset, the trigger time set as the end time of the highlight segment, and the moment N seconds earlier set as its start time. This is because users are generally accustomed to liking a highlight after finishing watching it, so the trigger time of the like record is generally the end time of the highlight.
It can be understood that, in this embodiment, the trigger time may instead be taken as the start time of the highlight segment and the moment N seconds after the trigger time as its end time, achieving the same technical effect.
S103: Clip the audio/video segment corresponding to the highlight segment from the video to be clipped according to the start time and the end time.
In step S103, specifically, after the audio/video segment is clipped from the video to be clipped, the backend server may further perform the following operations to improve the user's review experience: 1. intercept, from the video to be clipped, segments related to at least one actor according to the actors' roles in the audio/video segment; 2. intercept, from the video to be clipped, other audio/video segments corresponding to at least one complete line of dialogue according to the completeness of the dialogue in the audio/video segment.
It should be noted that after the audio/video segment is obtained, it can be provided to the user watching the video to be clipped and can also be pushed to other users. The clipped audio/video segment can be stored locally or on the backend server, and the user can look up the audio/video segments clipped for earlier likes.
Compared with the prior art, the present embodiment can obtain the trigger time at which the user performs a special operation on the video to be clipped (for example, a like button is provided on the playing interface; the user clicks it when watching highlight content of the video, and the moment of the click is the trigger time of the special operation), so the user's preferences for the watched video can be accurately learned from the special operation. By determining the start time and end time of the highlight segment in the video to be clipped according to the trigger time, the segment the user is interested in can be accurately obtained, and the corresponding audio/video segment can be clipped out automatically. This avoids the situation in which highlights cannot be clipped automatically and the user must go through a relatively cumbersome process to review the highlights of a video; instead, the segments the user considers highlights are clipped automatically, making it convenient for the user to review them.
The second embodiment of the invention relates to a video clipping method and is a further improvement on the first embodiment. The specific improvement is that, in the second embodiment, it is also judged whether the video to be clipped contains a key time point in the video segment spanning the M seconds before the trigger time; when a key time point is contained, the earliest one is taken as the start time of the highlight segment. In this way, every picture in the highlight segment can be further ensured to be one the user wants, long unnecessary stretches are avoided, and the user's viewing experience when watching the highlight is improved.
As shown in fig. 2, the specific flow of the present embodiment includes:
S201: Acquire the trigger time at which the user performs a special operation on the video to be clipped.
S202: Take the trigger time as the end time of the highlight segment and judge whether the video to be clipped contains a key time point in the video segment spanning the M seconds before the trigger time; if so, execute step S203; if not, execute step S204.
In step S202, specifically, the value of "M" here may be the same as or different from the value of "N" in the first embodiment; this embodiment does not specifically limit M. A key time point is a time point at which new information appears in the video segment, and includes one of the following types or any combination thereof: the time point at which a new person appears, at which a scene change occurs, at which the shot is switched, at which a new animal or object appears, and the start time point of each line of dialogue. More specifically, during playback of the video to be clipped, the backend server identifies each frame in real time and stores all recognition results as timestamped tags. The recognition covers:
(1) People: the people appearing in the picture are identified through face recognition, and the time point at which a new person appears (compared with the previous frame) is recorded as a key time point.
(2) Scene: the scene shown in the picture, such as an office, a basketball court, or a coffee house, is identified through scene recognition, and the time point at which the scene switches is recorded as a key time point.
(3) Shot: the shot-switching time point is identified through shot detection and recorded as a key time point.
(4) Animals/objects: the animals and objects appearing in the picture are identified through object detection, and the time point at which a new animal or object appears (compared with the previous frame) is recorded as a key time point.
(5) Dialogue: the content of the dialogue and the speaker are obtained through speech recognition, and the start time point of each line of dialogue is marked as a key time point.
S203: Take the earliest key time point in the video segment as the start time of the highlight segment, and determine the highlight segment.
Regarding step S203, specifically, take the video segment consisting of the 15 seconds before the trigger time of the like record as an example. If, during the first 0–5 seconds of that segment, the scene does not change and no new person appears or line is spoken, the 0–5 second portion can be regarded as unnecessary; that is, the user will most likely not consider it part of the highlight. In this way, the highlight segment can be prevented from containing long unnecessary stretches, improving the user's experience when reviewing it.
S204: Take the moment M seconds before the trigger time as the start time of the highlight segment, and determine the highlight segment.
S205: Clip the audio/video segment corresponding to the highlight segment from the video to be clipped.
Steps S201 and S204 to S205 in this embodiment are similar to steps S101 to S103 in the first embodiment and are not repeated here.
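The per-frame recognition results described in step S202 can feed key-time-point detection roughly as follows — a sketch that assumes each frame has already been labelled by the face/scene/shot/object/speech recognizers above (the `(timestamp, labels)` layout is an assumption):

```python
def key_time_points(frames: list[tuple[float, set[str]]]) -> list[float]:
    """Given per-frame (timestamp, recognized labels), return the timestamps
    at which new information appears: a label present in this frame but
    absent from the previous one (new person, scene change, new object,
    or a new line of dialogue)."""
    points: list[float] = []
    prev: set[str] = set()
    for ts, labels in frames:
        if labels - prev:  # something appeared that the previous frame lacked
            points.append(ts)
        prev = labels
    return points
```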
For ease of understanding, an application scenario of the video clipping method in this embodiment is described below, taking M = 15 as an example:
While watching a video, the user performs a like operation when playback reaches 5 min 20 s. The backend server stores the trigger time of the like record (5 min 20 s) and takes it as the end time of the highlight segment, then judges whether a key time point exists in the video segment spanning the 15 seconds before the trigger (i.e. from 5 min 5 s to 5 min 20 s). If not, 5 min 5 s is taken as the start time, and the video segment from 5 min 5 s to 5 min 20 s is saved as the highlight; if the earliest key time point is 5 min 10 s, then 5 min 10 s is taken as the start time, and the video segment from 5 min 10 s to 5 min 20 s is saved as the highlight.
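The branch logic of steps S202–S204 applied to this worked example can be sketched as (times in seconds; the helper name is an assumption):

```python
def highlight_start(trigger_s: float, key_points: list[float], m_seconds: float = 15.0) -> float:
    """Steps S202-S204: if any key time point falls within the M seconds
    before the trigger, use the earliest such point as the start time;
    otherwise fall back to M seconds before the trigger (clamped to 0)."""
    window_start = max(0.0, trigger_s - m_seconds)
    in_window = [p for p in key_points if window_start <= p <= trigger_s]
    return min(in_window) if in_window else window_start

# A like at 5:20 (320 s) with a key point at 5:10 (310 s) starts the
# highlight at 310 s; with no key points in the window it starts at 305 s.
```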
Compared with the prior art, the present embodiment can obtain the trigger time at which the user performs a special operation on the video to be clipped (for example, a like button is provided on the playing interface; the user clicks it when watching highlight content of the video, and the moment of the click is the trigger time of the special operation), so the user's preferences for the watched video can be accurately learned from the special operation. By determining the start time and end time of the highlight segment in the video to be clipped according to the trigger time, the segment the user is interested in can be accurately obtained, and the corresponding audio/video segment can be clipped out automatically. This avoids the situation in which highlights cannot be clipped automatically and the user must go through a relatively cumbersome process to review the highlights of a video; instead, the segments the user considers highlights are clipped automatically, making it convenient for the user to review them.
The third embodiment of the invention relates to a video clipping method and is a further improvement on the second embodiment. The specific improvement is that, in the third embodiment, after a plurality of tags in the highlight segment and their corresponding trigger time points are acquired, the tags are extracted and deduplicated, the deduplicated tags are provided to the user for selection, and after the user selects, the start time of the highlight segment is updated according to the earliest of the selected tags. In this way, the next time the user likes a video, the highlight segment can be determined from the updated start time and the like time, so that the clipped audio/video segment better matches the user's viewing habits, further improving the viewing experience.
As shown in fig. 3, the specific flow of the present embodiment includes:
S301: Acquire the trigger time at which the user performs a special operation on the video to be clipped.
S302: Take the trigger time as the end time of the highlight segment and judge whether the video to be clipped contains a key time point in the video segment spanning the N seconds before the trigger time; if so, execute step S303; if not, execute step S304.
S303: Take the earliest key time point in the video segment as the start time of the highlight segment, and determine the highlight segment.
S304: Take the moment N seconds before the trigger time as the start time of the highlight segment, and determine the highlight segment.
S305: Clip the audio/video segment corresponding to the highlight segment from the video to be clipped and provide it to the user.
S306: Perform tag identification on the highlight segment to obtain a plurality of tags in the highlight segment and the trigger time points corresponding to them.
Regarding step S306, specifically, in this embodiment, performing label identification on the highlight in the highlight may be understood as performing label identification on each frame of picture in the highlight, that is, identifying a person, a scene, an animal, or another object appearing in each frame of picture, where each person, scene, animal, or other object in each frame of picture corresponds to one label.
S307: judge whether identical tags exist among the plurality of tags; if so, retain the tag with the earliest trigger time point among the identical tags and remove the rest.
Regarding step S307: after the highlight segment is obtained, its tags need to be deduplicated. The reason is that if, for example, a person A appears throughout the highlight segment, many tags for person A will be generated, yet only the person-A tag with the earliest trigger time point should ultimately be shown to the user, which further improves the viewing experience.
S308: provide the plurality of deduplicated tags to the user, and determine the target tag selected by the user from among them.
Regarding step S308: when the deduplicated tags are provided to the user, they may be sorted from late to early by trigger time point and displayed in that order. This makes it convenient for the user to select the target tag and avoids the long selection time that a disordered tag list would cause.
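Steps S307 and S308 amount to a dictionary-based deduplication followed by a sort. A minimal sketch, with illustrative names not taken from the patent:

```python
def dedupe_tags(tag_detections):
    """S307: for each distinct tag, keep only the earliest trigger time point.

    tag_detections: iterable of (tag, time) pairs, one per appearance of
    a tag in a frame.
    """
    earliest = {}
    for tag, t in tag_detections:
        if tag not in earliest or t < earliest[tag]:
            earliest[tag] = t
    return earliest

def display_order(earliest):
    """S308: sort the deduplicated tags from late to early by trigger time point."""
    return sorted(earliest, key=earliest.get, reverse=True)
```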
S309: update the start time of the highlight segment according to the trigger time point corresponding to the target tag.
Regarding step S309: the user may select more than one target tag. In that case, the background server selects, from among the multiple target tags, the one with the earliest trigger time point as the update tag, and then updates the start time of the highlight segment according to the trigger time point corresponding to the update tag.
It should be noted that, in the present embodiment, the start time of the highlight segment is updated according to the following formula:
T = T − (T − T′) × k;
wherein T is the difference between the start time of the highlight segment and the trigger time, T′ is the difference between the trigger time point corresponding to the target tag and the trigger time, and k is a constant greater than 0 and less than 1. It is to be understood that the size of the update coefficient k is not particularly limited in this embodiment; a value of 0.2 is preferred.
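Selecting the update tag and applying the formula can be sketched as follows; the function names and the dictionary representation of target tags are illustrative assumptions, not from the patent:

```python
def pick_update_tag(target_tags):
    """S309: among multiple user-selected target tags, choose the one with
    the earliest trigger time point as the update tag.

    target_tags: dict mapping tag -> trigger time point (seconds).
    """
    return min(target_tags, key=target_tags.get)

def updated_start_offset(t, t_prime, k=0.2):
    """Apply T = T - (T - T') * k.

    t: current difference between the highlight start time and the trigger time.
    t_prime: difference between the update tag's trigger time point and the
        trigger time.
    k: update coefficient, 0 < k < 1 (0.2 preferred in this embodiment).
    """
    return t - (t - t_prime) * k
```

With T = 10 s and T′ = 8 s, the updated offset is 10 − (10 − 8) × 0.2 = 9.6 s.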
For ease of understanding, an application scenario of the video clipping method in this embodiment is described below, taking N = 15 as an example:
While watching a video, the user performs a like operation when playback reaches 5 minutes 20 seconds. The background server stores the trigger time of this like record (namely 5 minutes 20 seconds) and takes 5 minutes 20 seconds as the end time of the highlight segment. The background server then judges whether a key time point exists in the video segment covering the 15 seconds before 5 minutes 20 seconds (i.e., from 5 minutes 5 seconds onward). If not, 5 minutes 5 seconds is taken as the start time of the highlight segment, and the video segment from 5 minutes 5 seconds to 5 minutes 20 seconds is saved as the highlight segment. If a key time point exists and the earliest one is at 5 minutes 10 seconds, then 5 minutes 10 seconds is taken as the start time, and the video segment from 5 minutes 10 seconds to 5 minutes 20 seconds is saved as the highlight segment.
Taking the video segment from 5 minutes 10 seconds to 5 minutes 20 seconds as the highlight segment, the background server performs tag identification on each frame of this segment to obtain a plurality of tags and deduplicates them. Suppose the deduplicated tags are person A, object B, and scene C, each with a corresponding trigger time point. The background server sends person A, object B, and scene C to the user. Suppose the user selects person A and scene C, with person A's trigger time point at 5 minutes 12 seconds and scene C's at 5 minutes 14 seconds; then 5 minutes 12 seconds is taken as the trigger time point of the target tag. The initial T is 5 minutes 20 seconds minus 5 minutes 10 seconds, i.e. 10 seconds, and T′ is 5 minutes 20 seconds minus 5 minutes 12 seconds, i.e. 8 seconds, so the updated T is 10 − (10 − 8) × 0.2 = 9.6 seconds. That is, the next time the user performs a like operation, the background server will save the video segment covering the 9.6 seconds before the trigger time of the like record as the highlight segment.
Compared with the prior art, this embodiment acquires the trigger time at which the user performs a special operation on the video to be clipped (for example, a like button may be provided on the playback interface; the user clicks it when watching highlight content, and the moment of the click is the trigger time of the special operation), so the user's preferences regarding the watched video can be accurately learned from the special operation. The start time and end time of the highlight segment in the video to be clipped are determined according to the trigger time, so the highlight segment the user is interested in can be accurately obtained, and the corresponding audio and video segment is clipped from the video to be clipped. This automates the clipping of highlight segments, avoiding the situation where highlights in a video cannot be clipped automatically and a user who wants to view the highlights of a video faces a relatively cumbersome manual process; since the highlight segments the user is interested in are clipped automatically, the user can view them conveniently.
A fourth embodiment of the present invention relates to a network device, as shown in fig. 4, including:
at least one processor 401; and
a memory 402 communicatively coupled to the at least one processor 401; wherein
the memory 402 stores instructions executable by the at least one processor 401, the instructions being executed by the at least one processor 401 to enable the at least one processor 401 to perform the video clipping method described above.
The memory 402 and the processor 401 are coupled by a bus, which may include any number of interconnected buses and bridges linking together one or more of the various circuits of the processor 401 and the memory 402. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor 401 may be transmitted over a wireless medium via an antenna, which may also receive data and pass it to the processor 401.
The processor 401 is responsible for managing the bus and general processing, and may provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 402 may be used to store data used by the processor 401 in performing operations.
A fifth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the method embodiments described above.
That is, as those skilled in the art can understand, all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that in practice various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (8)

1. A video clipping method, comprising:
acquiring a trigger time at which a user performs a special operation on a video to be clipped, wherein the special operation comprises at least one of liking, favoriting, forwarding, sharing, and commenting;
determining a start time and an end time of a highlight segment in the video to be clipped according to the trigger time;
clipping, according to the start time and the end time, an audio and video segment corresponding to the highlight segment from the video to be clipped;
wherein after determining the start time and the end time of the highlight segment in the video to be clipped according to the trigger time, the method further comprises:
acquiring a plurality of identification tags of the highlight segment within a determined time period and a trigger time of each identification tag, wherein the determined time period is the time period from the start time to the end time;
judging whether identical tags exist among the plurality of identification tags, and when identical tags exist, retaining the tag with the earliest trigger time among the identical tags and removing the other identical tags;
providing the plurality of deduplicated identification tags to the user, and acquiring a target tag selected by the user from the plurality of identification tags;
and updating the start time of the highlight segment according to the trigger time of the target tag.
2. The video clipping method according to claim 1, wherein determining the start time and the end time of the highlight segment in the video to be clipped according to the trigger time specifically comprises:
determining the highlight segment by taking the trigger time as the end time and the time N seconds before the trigger time as the start time, wherein N is a natural number greater than 0.
3. The video clipping method according to claim 2, wherein before taking the time N seconds before the trigger time as the start time, the method further comprises:
judging whether the video to be clipped contains a key time point in the video segment from N seconds before the trigger time to the trigger time, wherein a key time point is a time point at which new information appears in the video segment;
when no key time point is contained, executing the step of taking the time N seconds before the trigger time as the start time;
and when a key time point is contained, taking the earliest key time point in the video segment as the start time.
4. The video clipping method according to claim 3, wherein the key time points comprise one or any combination of the following types:
the time point at which a new person appears in the video segment, the time point at which a scene switch occurs, the time point at which a shot switch occurs, the time point at which a new animal or object appears, and the start time point of each line of dialogue.
5. The video clipping method according to claim 1, wherein the number of target tags is plural, and updating the start time of the highlight segment according to the trigger time of the target tag specifically comprises:
selecting, from the plurality of target tags, the target tag with the earliest trigger time as an update tag;
and updating the start time of the highlight segment according to the trigger time of the update tag.
6. The video clipping method according to claim 1 or 5, wherein the start time of the highlight segment is updated according to the following formula:
T = T − (T − T′) × k;
wherein T is the difference between the start time and the trigger time, T′ is the difference between the trigger time corresponding to the target tag and the trigger time, and k is a constant greater than 0 and less than 1.
7. A network device, comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video clipping method of any one of claims 1 to 6.
8. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video clipping method of any one of claims 1 to 6.
CN202010156612.5A 2020-03-09 2020-03-09 Video clipping method, network device, and computer-readable storage medium Active CN111447505B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010156612.5A CN111447505B (en) 2020-03-09 2020-03-09 Video clipping method, network device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010156612.5A CN111447505B (en) 2020-03-09 2020-03-09 Video clipping method, network device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN111447505A CN111447505A (en) 2020-07-24
CN111447505B true CN111447505B (en) 2022-05-31

Family

ID=71653153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010156612.5A Active CN111447505B (en) 2020-03-09 2020-03-09 Video clipping method, network device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111447505B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021225608A1 (en) 2020-05-08 2021-11-11 WeMovie Technologies Fully automated post-production editing for movies, tv shows and multimedia contents
CN116830195B (en) * 2020-10-28 2024-05-24 唯众挚美影视技术公司 Automated post-production editing of user-generated multimedia content
CN113965805A (en) * 2021-10-22 2022-01-21 北京达佳互联信息技术有限公司 Prediction model training method and device and target video editing method and device
CN114339399A (en) * 2021-12-27 2022-04-12 咪咕文化科技有限公司 Multimedia file editing method and device and computing equipment
CN114143575A (en) * 2021-12-31 2022-03-04 上海爱奇艺新媒体科技有限公司 Video editing method and device, computing equipment and storage medium
CN114666637B (en) * 2022-03-10 2024-02-02 阿里巴巴(中国)有限公司 Video editing method, audio editing method and electronic equipment
CN114827454B (en) * 2022-03-15 2023-10-24 荣耀终端有限公司 Video acquisition method and device
CN115022659B (en) * 2022-05-31 2024-06-25 广州虎牙科技有限公司 Live video processing method, system and live equipment
CN115190356B (en) * 2022-06-10 2023-12-19 北京达佳互联信息技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN115914739A (en) * 2022-11-10 2023-04-04 南京伟柏软件技术有限公司 Video sharing method and device and electronic equipment

Citations (1)

Publication number Priority date Publication date Assignee Title
WO2018171325A1 (en) * 2017-03-21 2018-09-27 华为技术有限公司 Video hotspot fragment extraction method, user equipment, and server

Family Cites Families (21)

Publication number Priority date Publication date Assignee Title
US20050044561A1 (en) * 2003-08-20 2005-02-24 Gotuit Audio, Inc. Methods and apparatus for identifying program segments by detecting duplicate signal patterns
WO2007053112A1 (en) * 2005-11-07 2007-05-10 Agency For Science, Technology And Research Repeat clip identification in video data
TWI419061B (en) * 2010-01-18 2013-12-11 Pixart Imaging Inc Method for recognizing multiple objects
US9727215B2 (en) * 2013-11-11 2017-08-08 Htc Corporation Method for performing multimedia management utilizing tags, and associated apparatus and associated computer program product
US10079040B2 (en) * 2013-12-31 2018-09-18 Disney Enterprises, Inc. Systems and methods for video clip creation, curation, and interaction
CN104410920B (en) * 2014-12-31 2015-12-30 合一网络技术(北京)有限公司 The method of wonderful mark is carried out based on video segmentation playback volume
CN106658231A (en) * 2015-10-29 2017-05-10 亦非云信息技术(上海)有限公司 Design method for sharing video clip in real time
CN105657537B (en) * 2015-12-23 2018-06-19 小米科技有限责任公司 Video clipping method and device
CN105939494A (en) * 2016-05-25 2016-09-14 乐视控股(北京)有限公司 Audio/video segment providing method and device
CN106210902B (en) * 2016-07-06 2019-06-11 华东师范大学 A kind of cameo shot clipping method based on barrage comment data
US10694227B2 (en) * 2017-01-13 2020-06-23 Panasonic Intellectual Property Management Co., Ltd. Video transmission system and video transmission method
CN108933949B (en) * 2017-05-27 2021-08-31 南宁富桂精密工业有限公司 Multimedia control method, server and computer storage medium
CN109672922B (en) * 2017-10-17 2020-10-27 腾讯科技(深圳)有限公司 Game video editing method and device
CN108540854A (en) * 2018-03-29 2018-09-14 努比亚技术有限公司 Live video clipping method, terminal and computer readable storage medium
CN110519655B (en) * 2018-05-21 2022-06-10 阿里巴巴(中国)有限公司 Video editing method, device and storage medium
CN109151532A (en) * 2018-08-13 2019-01-04 冼钇冰 A kind of video intercepting method, apparatus, terminal and computer readable storage medium
CN109819325B (en) * 2019-01-11 2021-08-20 平安科技(深圳)有限公司 Hotspot video annotation processing method and device, computer equipment and storage medium
CN109889856A (en) * 2019-01-21 2019-06-14 南京微特喜网络科技有限公司 A kind of live streaming editing system based on artificial intelligence
CN109905780A (en) * 2019-03-30 2019-06-18 山东云缦智能科技有限公司 A kind of video clip sharing method and Intelligent set top box
CN110234037B (en) * 2019-05-16 2021-08-17 北京百度网讯科技有限公司 Video clip generation method and device, computer equipment and readable medium
CN110703976B (en) * 2019-08-28 2021-04-13 咪咕文化科技有限公司 Clipping method, electronic device, and computer-readable storage medium

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
WO2018171325A1 (en) * 2017-03-21 2018-09-27 华为技术有限公司 Video hotspot fragment extraction method, user equipment, and server

Also Published As

Publication number Publication date
CN111447505A (en) 2020-07-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant