CN114666637B - Video editing method, audio editing method and electronic equipment - Google Patents


Info

Publication number
CN114666637B
CN114666637B (application number CN202210234347.7A)
Authority
CN
China
Prior art keywords
sentences
video
sentence
deleted
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210234347.7A
Other languages
Chinese (zh)
Other versions
CN114666637A (en)
Inventor
周凡皙
张晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210234347.7A priority Critical patent/CN114666637B/en
Publication of CN114666637A publication Critical patent/CN114666637A/en
Application granted granted Critical
Publication of CN114666637B publication Critical patent/CN114666637B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4394Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Abstract

The embodiments of the present application provide a video editing method, an audio editing method, and an electronic device. The video editing method comprises the following steps: displaying a plurality of sentences and clipping operation identifiers respectively corresponding to the sentences, wherein the audio text synchronized with the video to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding video segments in the video to be clipped, and each clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the video segment associated with a sentence; and, in response to a restoration operation triggered by the user selecting a deleted sentence with reference to its clipping operation identifier, restoring the video segment associated with that deleted sentence. With the technical solution provided by the embodiments of the present application, the user can restore precisely those deleted sentences that the user wants to restore, the clipping operations are separated from other operations (such as sentence editing), deleted sentences are restored efficiently, and the user experience is good.

Description

Video editing method, audio editing method and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video editing method, an audio editing method, and an electronic device.
Background
Videos are now ubiquitous in daily life: people can shoot a video anytime and anywhere, clip it, and share it on a social platform or with friends. In live-streaming and short-video scenarios, for example, a large number of practitioners need to produce live-streamed and narrated videos containing a great deal of spoken explanation.
For such videos, a common approach is to clip the video through text editing based on subtitle recognition: the user deletes the corresponding video segment by deleting the subtitle sentence. However, existing clipping methods offer little flexibility. Suppose, for example, that the user performs three deletions and then finds that the first deleted subtitle sentence should be recovered: the user has to undo all three operations and clip again.
Disclosure of Invention
In view of the problems in the prior art, embodiments of the present application provide a video editing method, an audio editing method, and an electronic device with high editing flexibility.
In one embodiment of the present application, a video editing method is provided. The method comprises the following steps:
displaying a plurality of sentences and clipping operation identifiers respectively corresponding to the sentences, wherein the audio text synchronized with the video to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding video segments in the video to be clipped, and each clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the video segment associated with a sentence; and
in response to a restoration operation triggered by the user selecting a deleted sentence with reference to its clipping operation identifier, restoring the video segment associated with the deleted sentence.
In another embodiment of the present application, a video editing method is provided. The method comprises the following steps:
displaying a plurality of deleted sentences and recovery marks respectively corresponding to the deleted sentences, wherein the audio text synchronized with the video to be clipped contains the plurality of deleted sentences, and the deleted sentences are respectively associated with corresponding video segments in the video to be clipped; and
in response to an operation triggered by the user on the recovery mark corresponding to any target sentence among the plurality of deleted sentences, restoring the video segment associated with that target sentence.
In yet another embodiment of the present application, a video editing method is provided. The method comprises the following steps:
determining display modes respectively corresponding to a plurality of sentences based on clipping operation attributes respectively corresponding to the sentences, wherein the audio text synchronized with the video to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding video segments in the video to be clipped, and the clipping operation attribute indicates the type of the user's clipping operation; and
in response to a restoration operation triggered after the user selects a deleted sentence according to its display mode, restoring the video segment associated with the deleted sentence, and adjusting the clipping operation attribute corresponding to the deleted sentence so as to change its display mode.
In yet another embodiment of the present application, an audio clipping method is provided. The method comprises the following steps:
displaying a plurality of sentences and clipping operation identifiers respectively corresponding to the sentences, wherein the audio text synchronized with the audio to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding audio segments in the audio to be clipped, and each clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the audio segment associated with a sentence; and
in response to a restoration operation triggered by the user selecting a deleted sentence with reference to its clipping operation identifier, restoring the audio segment associated with the deleted sentence.
In yet another embodiment of the present application, an electronic device is provided. The electronic device comprises a processor and a memory, wherein the memory is used for storing one or more computer instructions; the processor, coupled to the memory, is configured to execute the one or more computer instructions for performing the steps of the method embodiments described above.
In yet another embodiment of the present application, a computer program product is provided. The computer program product comprises a computer program or instructions which, when executed by a processor, cause the processor to carry out the steps of the method embodiments described above.
Embodiments of the present application further provide a computer-readable storage medium storing a computer program that, when executed by a computer, implements the method steps or functions provided by the above embodiments.
According to one technical solution provided by the embodiments of the present application, a clipping operation identifier is displayed for each sentence; the clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the video segment associated with a sentence. The user can restore precisely those deleted sentences that the user wants to restore, the clipping operations are separated from other operations (such as sentence editing), deleted sentences are restored efficiently, and the user experience is good.
According to another technical solution provided by the embodiments of the present application, a corresponding clipping operation attribute is configured for each sentence, and different attributes correspond to different display modes. Through the different display modes, the user can distinguish sentences in various states, such as deleted sentences and retained sentences. The user can restore precisely those deleted sentences that the user wants to restore, the clipping operations are separated from other operations (such as sentence editing), deleted sentences are restored efficiently, and the user experience is good.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below of the drawings that are needed in the embodiments or the prior art descriptions, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a video editing method according to an embodiment of the present disclosure;
FIG. 2a is a schematic view of a first state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 1;
FIG. 2b is a schematic diagram of a second state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 1;
FIG. 2c is a schematic view of a third state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 1;
FIG. 3 is a flowchart of a video editing method according to another embodiment of the present application;
FIG. 4a is a schematic view of a first state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 3;
FIG. 4b is a schematic view of a second state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 3;
FIG. 5 is a flowchart of a video editing method according to another embodiment of the present application;
FIG. 6a is a schematic view of a first state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 5;
FIG. 6b is a schematic view of a second state of a video editing interface implemented by the video editing method according to the embodiment shown in FIG. 5;
FIG. 7 is a flowchart of an audio editing method according to an embodiment of the present disclosure;
FIG. 8 is a schematic illustration of a clipping scheme provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a video editing apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The video clipping scheme based on subtitle recognition enables a user to edit a video as if editing a document, deleting the corresponding video segment while deleting the text; it is mainly suitable for live-streamed and narrated content with a large amount of spoken explanation. Existing subtitle-recognition-based video clipping schemes have the following drawbacks:
1. Low operating efficiency: after deleting sentences several times, if the user wants to recover the sentences deleted in one particular operation, the user must undo the operations one by one, in sequence, until those sentences are restored.
2. Deleted segments cannot be restored selectively: undo applies to deletion operations and subtitle editing operations alike, so to recover the content deleted in one particular operation the user must undo, in sequence, every deletion operation, subtitle editing operation, and so on performed after it, and all of the editing the user did in between is lost.
3. Little room for adjustment: the content the user has deleted is no longer visible.
In view of at least some of the problems described above, the following examples of the present application are presented. In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application.
Some of the flows described in the specification, claims, and drawings above include a plurality of operations occurring in a particular order; these operations may also be performed out of that order, or concurrently. Sequence numbers such as 101 and 102 merely distinguish the operations and do not by themselves represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that the terms "first" and "second" herein are used to distinguish different messages, devices, modules, etc.; they do not represent an order, nor do they require that the "first" and the "second" be of different types. Furthermore, the embodiments described below are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments herein without inventive effort fall within the scope of the present application.
Fig. 1 is a schematic flow chart of a video editing method according to an embodiment of the present application. The method provided in this embodiment may be executed by a client device, such as a mobile phone, desktop computer, notebook computer, tablet computer, or smart wearable device, which is not limited in this embodiment. The client device is provided with a computer program product (such as a clipping application) capable of implementing the following method steps, or with a functional module, added to an existing application, capable of implementing them. Specifically, as shown in fig. 1, the method includes:
101. Displaying a plurality of sentences and the clipping operation identifiers respectively corresponding to the sentences.
The audio text synchronized with the video to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding video segments in the video to be clipped, and each clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the video segment associated with a sentence.
102. In response to a restoration operation triggered by the user selecting a deleted sentence with reference to its clipping operation identifier, restoring the video segment associated with the deleted sentence.
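The per-sentence clip state behind steps 101 and 102 can be sketched minimally as follows. This is an illustrative Python model, not code from the patent; all names (`Sentence`, `toggle_clip`, `kept_segments`) are hypothetical. The point it shows is that each sentence carries its own deleted/kept flag, so any one sentence can be restored directly, without an undo stack.

```python
# Hypothetical sketch of the per-sentence clip state: each caption sentence
# carries its own deleted/kept flag, so one sentence can be restored on its
# own without touching any other operation.
from dataclasses import dataclass

@dataclass
class Sentence:
    text: str              # caption text recognized from the audio
    start: float           # start time of the associated video segment (s)
    end: float             # end time of the associated video segment (s)
    deleted: bool = False  # clip-operation state shown by the identifier icon

def toggle_clip(sentence: Sentence) -> None:
    """Delete or restore the video segment associated with one sentence."""
    sentence.deleted = not sentence.deleted

def kept_segments(sentences: list[Sentence]) -> list[tuple[float, float]]:
    """Time ranges that remain in the output video, in play order."""
    return [(s.start, s.end) for s in sentences if not s.deleted]

sentences = [Sentence("first", 0.0, 2.0), Sentence("second", 2.0, 4.5)]
toggle_clip(sentences[1])  # delete the second sentence's segment
toggle_clip(sentences[1])  # restore it again, independently of anything else
```

Because restoration only flips one flag, restoring a sentence deleted three operations ago never disturbs the operations performed in between.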
In step 101 above, the clipping operation identifier has two presentation icons: a first icon indicating that the sentence has not been deleted and prompting the user that it may be deleted, and a second icon indicating that the sentence has been deleted and prompting the user that it may be restored at any time. Correspondingly, the method provided by this embodiment may further include the following step:
in response to a restoration operation triggered after the user selects a deleted sentence with reference to its clipping operation identifier, switching the clipping operation identifier corresponding to the deleted sentence from the second icon to the first icon.
As shown in FIGS. 2a, 2b and 2c, the first icon and the second icon may take the forms shown in the figures. Of course, icons other than those shown in the drawings are also possible; this embodiment is not specifically limited in this respect.
In another implementation, the clipping operation identifier may have no concrete visual form but instead a corresponding attribute. For example, the clipping operation identifier has two presentation attributes: a first attribute indicating that the sentence has not been deleted and prompting the user that it may be deleted, and a second attribute indicating that the sentence has been deleted and prompting the user that it may be restored at any time. Different attributes correspond to different display modes, so sentences with different attributes are displayed differently. For example, a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed in gray; or a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed with a strikethrough, such as "hello" struck through; or a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed blinking; and so on. This scheme is described in an embodiment below, in which the clipping operation identifier is referred to as a "clipping operation attribute".
In step 102 above, the restoration operation for a deleted sentence triggered by the user based on the clipping operation identifier may be: a click performed by the user on a clipping operation identifier that has a concrete icon form (as shown in fig. 2b); or, where the clipping operation identifier has no concrete form but is characterized by a display attribute, a right-click by the user on the corresponding sentence followed by a click on the corresponding option in a floating window (as shown in fig. 6b); or the user directly completing the restoration by pressing a corresponding control on the keyboard (such as the backspace or enter key) with the sentence selected. Of course, the keyboard controls may also be set manually; this embodiment is not limited in this respect.
It should be noted that video clipping refers to the process of generating a target video from video material through operations such as segment extraction, position arrangement, playback speed adjustment, and segment transition effects. The technical solution provided in this embodiment concerns clipping operations such as retaining and deleting segments; ordering of segment positions, playback speed adjustment, segment transition effects, and the like are not involved.
According to the technical solution provided by this embodiment, a clipping operation identifier is displayed for each sentence; the clipping operation identifier indicates the type of the user's clipping operation and assists the user in deleting or restoring, at will, the video segment associated with a sentence. The user can restore precisely those deleted sentences that the user wants to restore, the clipping operations are separated from other operations (such as sentence editing), deleted sentences are restored efficiently, and the user experience is good.
Further, in the embodiment, the step 101 of displaying the plurality of sentences and the clipping operation identifiers corresponding to the plurality of sentences respectively may include:
1011. sequentially displaying the sentences according to the playing sequence of the video to be clipped;
1012. and respectively displaying the corresponding clipping operation identifiers on the corresponding rows of the plurality of sentences.
In this embodiment, the clipping interface may display only the sentences of the audio text, with multiple sentences displayed at the same time; the user can browse sentences not currently displayed by a sliding operation. Alternatively, besides the plurality of sentences, the clipping interface may also display and play the video to be clipped. That is, in one possible embodiment, the method may further include:
103. Displaying a clipping interface, wherein the clipping interface comprises a video playing area and a sentence display area for displaying the sentences;
104. playing the video to be clipped in the video playing area;
105. and determining the sentences based on the playing progress of the video to be clipped.
As shown in the figures, the video playing area and the sentence display area on the clipping interface may be arranged one above the other, an upper area of the clipping interface serving as the video playing area and a lower area as the sentence display area. For example, the upper 1/3 of the clipping interface may be the video playing area and the lower 2/3 the sentence display area.
Accordingly, the step 1011 "sequentially displaying the plurality of sentences according to the playing order of the video to be clipped" may include:
and in the sentence display area, scrolling and displaying at least one sentence associated with the sentence matched with the playing progress and the video clip to be played after the playing progress.
Further, the method provided in this embodiment may further include the following steps:
106. In response to a sliding operation of the user, determining a sliding track;
107. determining a plurality of sentences to be operated on according to the sliding track;
108. switching the display of the clipping operation identifiers respectively corresponding to the plurality of sentences to be operated on, from the first icon to the second icon or from the second icon to the first icon; and
109. performing a clipping operation on the video segments associated with each of the plurality of sentences to be operated on, so as to delete or restore the corresponding video segments.
The above step 107 may specifically be implemented in the following three ways:
In the first way, according to the start position and end position of the sliding track, all sentences between the start position and the end position whose clipping operation identifier is the second icon are determined to be the plurality of sentences to be operated on.
In the second way, according to the start position and end position of the sliding track, all sentences between the start position and the end position whose clipping operation identifier is the first icon are determined to be the plurality of sentences to be operated on.
In the third way, according to the start position and end position of the sliding track, all sentences between the start position and the end position are determined to be the plurality of sentences to be operated on.
In the first way, when step 108 is executed, the clipping operation identifier corresponding to each sentence is uniformly switched from the second icon to the first icon, and when step 109 is executed, the video segments associated with the sentences are uniformly restored.
In the second way, when step 108 is executed, the clipping operation identifier corresponding to each sentence is uniformly switched from the first icon to the second icon, and when step 109 is executed, the video segments associated with the sentences are uniformly deleted.
In the third way, when step 108 is executed, the clipping operation identifier corresponding to each sentence of the first type is switched from the first icon to the second icon, and the clipping operation identifier corresponding to each sentence of the second type is switched from the second icon to the first icon (i.e., the opposite operation is performed uniformly); a sentence of the first type is one whose original clipping operation identifier is the first icon (i.e., not deleted), and a sentence of the second type is one whose original clipping operation identifier is the second icon (i.e., deleted). When step 109 is executed, the video segments associated with sentences of the first type are deleted and the video segments associated with sentences of the second type are restored.
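The three selection modes of step 107 can be sketched as one function over the index range spanned by the sliding track. This is an illustrative Python sketch, not the patent's implementation; the mode names and the `deleted` flag (standing in for the first/second icon) are assumptions:

```python
# Hedged sketch of the three slide-selection modes: the slide track spans a
# range of sentence indices, and each mode picks a different subset to act on.
def select_to_operate(deleted: list[bool], start: int, end: int,
                      mode: str) -> list[int]:
    """Indices of the sentences to be operated on within [start, end]."""
    indices = range(start, end + 1)
    if mode == "restore":   # way 1: only already-deleted sentences
        return [i for i in indices if deleted[i]]
    if mode == "delete":    # way 2: only not-yet-deleted sentences
        return [i for i in indices if not deleted[i]]
    return list(indices)    # way 3: every sentence, each toggled to its opposite

deleted = [False, True, True, False, False]  # current per-sentence clip state
```

Steps 108 and 109 would then flip the flag (and icon) for each returned index and delete or restore the associated video segments accordingly.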
Further, the method provided by this embodiment may further include the following steps:
110. Performing speech recognition on the audio synchronized with the video to be clipped, and/or subtitle recognition, to obtain the audio text;
111. in response to an editing operation by the user on a sentence to be modified among the displayed sentences, modifying that sentence based on the user's edits;
wherein the modification to the sentence includes at least one of: modification of text content, modification of text style, and modification of sentence breaks.
In addition, the video to be clipped in this embodiment may be a video from a live television program, a video from the internet, or a video acquired by an image acquisition device. In this embodiment, the text information corresponding to the speech may be determined by performing speech recognition on the audio using automatic speech recognition (ASR, Automatic Speech Recognition), a speech recognition technology based on deep learning, or the like; and/or by performing text recognition on the image content of the video frames in a video segment, for example by optical character recognition (OCR, Optical Character Recognition).
In this embodiment, the clipping operation is separated from the editing operation: the type of clipping operation is characterized by the clipping operation identifier, and each sentence corresponds to one clipping operation identifier. With reference to the clipping operation identifier, the user performs only clipping-related operations on a sentence, such as deletion and restoration. Text editing, i.e., an editing operation on a sentence, is independent of the clipping operations. Suppose the user deletes the first sentence (and with it the associated video segment), then edits and modifies the fifth sentence, then performs other clipping operations, and so on. If the user then wants to recover the deleted first sentence, the user can restore it precisely, without withdrawing the edit to the fifth sentence or any of the other clipping operations performed in the meantime.
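The decoupling described above can be made concrete in a few lines: clip state and caption text live in separate fields, so a targeted restore never rolls back an unrelated edit. A minimal sketch under assumed names (`edit_text`, `restore`); none of these identifiers come from the patent:

```python
# Hedged sketch: text edits and clip operations touch disjoint state, so
# restoring one deleted sentence preserves every caption edit made since.
sentences = [
    {"text": "first", "deleted": True},   # clipped away in an earlier operation
    {"text": "fifth", "deleted": False},
]

def edit_text(sentence: dict, new_text: str) -> None:
    """Editing operation: modifies the caption text only."""
    sentence["text"] = new_text

def restore(sentence: dict) -> None:
    """Clipping operation: modifies the clip state only."""
    sentence["deleted"] = False

edit_text(sentences[1], "fifth (revised)")  # later caption edit
restore(sentences[0])                       # targeted restore; the edit survives
```

Contrast this with a single undo stack, where undoing back to the first deletion would also discard the edit to the fifth sentence.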
Referring to the examples shown in fig. 2a, 2b and 2c, after a user imports a video to be clipped into an application (APP), the clipping interface of the application contains two areas. The upper area is a video playing area for playing the video to be clipped, on which the playing progress is displayed. The lower area is a sentence display area for displaying the multiple sentences and the clipping operation identifier 1 corresponding to each sentence. As shown in fig. 2a, before the user has performed any operation on the video to be clipped, the clipping operation identifier 1 corresponding to each sentence is a first icon (indicating that the sentence has not been pruned and prompting the user that it can be deleted). The user may trigger the deletion of some sentences in one or several operations; as shown in fig. 2b, the second, third, fourth, seventh and eighth sentences have been deleted by the user. Correspondingly, the video segments associated with the second, third, fourth, seventh and eighth sentences are deleted from the video to be clipped. The second, third and fourth sentences can be deleted at once by the sliding operation mentioned in this embodiment: for example, the user slides from the second sentence to the fourth sentence, and the second, third and fourth sentences are deleted in one pass. Of course, they may also be deleted in multiple operations, which is not particularly limited in this embodiment.
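The one-pass slide deletion can be sketched as marking every sentence in the swiped range deleted at once. The 0-based indexing and the handling of a swipe in either direction are assumptions for illustration.

```python
# Sketch of slide deletion: a swipe from one sentence to another marks
# the whole range deleted in a single operation.

def slide_delete(deleted_flags, start_index, end_index):
    lo, hi = sorted((start_index, end_index))  # swipe may go either direction
    for i in range(lo, hi + 1):
        deleted_flags[i] = True
    return deleted_flags

flags = [False] * 8
slide_delete(flags, 1, 3)   # swipe from the second to the fourth sentence
print(flags)
```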
Suppose that the seventh and eighth sentences are deleted first, and the second, third and fourth sentences are deleted later. The user then feels that the continuity of the context is poor after the fourth sentence is deleted, making the edited video somewhat abrupt, and wants to withdraw that deletion and restore the sentence. With the technical solution provided in this embodiment, as shown in fig. 2b, the user may click the clipping operation identifier corresponding to the fourth sentence to restore the deleted fourth sentence, as shown in fig. 2c. In fig. 2c, the clipping operation identifier corresponding to the fourth sentence is switched from being displayed as the second icon to being displayed as the first icon. This icon switch is what the user perceives on the interface; what is not directly perceivable is that, after the user performs the operation, the deleted fourth sentence is restored into the video to be clipped.
Fig. 3 is a schematic flow chart of a video editing method according to another embodiment of the present application. Similarly, the execution subject of the method provided in this embodiment may be a client device, such as a mobile phone, a desktop computer, a notebook computer, a tablet computer, an intelligent wearable device, etc., which is not limited in this embodiment. The client device is provided with a computer program product (such as a clipping application) capable of realizing the following method steps, or a functional module added in the existing application capable of realizing the following method steps, etc. Specifically, as shown in fig. 3, the method includes:
201. Displaying a plurality of deleted sentences and recovery marks corresponding to the deleted sentences respectively; the audio text synchronized with the video to be clipped contains the plurality of pruned sentences, and the plurality of pruned sentences are respectively associated with corresponding video clips in the video to be clipped.
202. And responding to the operation triggered by the recovery mark corresponding to any target sentence in the plurality of deleted sentences by the user, and recovering the video clips associated with the target sentences.
The displaying of the plurality of pruned sentences in 201 above may include:
sequentially displaying the plurality of pruned sentences in a time sequence of user pruned operations; or alternatively
And sequentially displaying the plurality of deleted sentences according to the playing sequence of the video to be clipped.
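The two display orders for the pruned sentences can be sketched as two sorts over the same deleted-sentence records. The field names (`play_pos`, `deleted_at`) are illustrative assumptions.

```python
# Sketch of the two display orders for pruned sentences:
# by deletion time, or by play order in the video to be clipped.

deleted = [
    {"text": "seventh", "play_pos": 7, "deleted_at": 1},
    {"text": "second",  "play_pos": 2, "deleted_at": 2},
    {"text": "fourth",  "play_pos": 4, "deleted_at": 3},
]

by_deletion_time = sorted(deleted, key=lambda s: s["deleted_at"])
by_play_order = sorted(deleted, key=lambda s: s["play_pos"])

print([s["text"] for s in by_deletion_time])
print([s["text"] for s in by_play_order])
```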
Further, in an achievable technical solution, the method provided in this embodiment may further include the following steps:
203. displaying a clipping interface; the editing interface comprises a video playing area and a sentence displaying area;
204. playing the video to be clipped in the video playing area;
205. displaying a plurality of un-deleted sentences in the sentence display area; wherein, the plurality of non-pruned sentences comprise sentences which are adapted to the playing progress of the video to be clipped;
206. And if it is determined according to the playing order of the video to be clipped that at least one deleted sentence exists after an un-deleted sentence, displaying prompt information at the corresponding position of that un-deleted sentence, so as to prompt the user that deleted sentences exist after it.
Based on the above-mentioned scheme, the displaying of the plurality of pruned sentences and the recovery identifications corresponding to the plurality of pruned sentences in step 201 of this embodiment may include:
1011. responding to the operation of the user on the prompt information, and acquiring the multiple deleted sentences after the prompt information corresponds to the un-deleted sentences;
1012. displaying the deleted sentences and the recovery identifications corresponding to the deleted sentences respectively in a popup window; or shifting down the sentences shown after the non-pruned sentences to leave a space, and showing the pruned sentences and the recovery marks corresponding to the pruned sentences in the space.
As shown in FIG. 4a, three sentences have been deleted after the first sentence, and prompt information may be displayed on the interface at that position. After the user clicks the prompt information, an interface such as the one shown in fig. 4b may be presented, for example displaying the multiple deleted sentences and the recovery identifications corresponding to them in a popup window. The drawings in the specification do not show the other way, namely shifting down the sentences shown after the un-deleted sentence to leave a space, and showing the multiple pruned sentences and their corresponding recovery identifications in that space. The display effect obtained in this way is similar to that of fig. 2b described above.
In the technical solution provided in this embodiment, the sentence display area displays only the un-deleted sentences, and displays prompt information at the positions of the un-deleted sentences after which deleted sentences exist; from the prompt, the user knows that deleted sentences exist there and can see them by clicking. The benefit is that more un-deleted sentences are displayed in the limited sentence display area, the display is complete and coherent, and the user does not have to keep searching up and down, which helps the clipping personnel find the places where the clip is not right.
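The placement of the prompt information can be sketched by walking the sentences in play order and attaching each run of deleted sentences to the last un-deleted sentence before it; every non-empty group yields one prompt marker. The (text, is_deleted) representation is an illustrative assumption.

```python
# Sketch: for each un-deleted sentence, collect the deleted sentences that
# immediately follow it in play order; a non-empty group means a prompt
# marker is shown at that position.

def hint_positions(sentences):
    """sentences: list of (text, is_deleted) in play order."""
    hints, last_kept = {}, None
    for text, is_deleted in sentences:
        if is_deleted and last_kept is not None:
            hints.setdefault(last_kept, []).append(text)
        elif not is_deleted:
            last_kept = text
    return hints

order = [("s1", False), ("s2", True), ("s3", True), ("s4", False), ("s5", True)]
print(hint_positions(order))
```

Clicking the marker at "s1" would then reveal the deleted "s2" and "s3" in a popup or in a shifted-down space.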
Fig. 5 shows a flowchart of a video editing method according to another embodiment of the present application. As shown in fig. 5, the method provided in this embodiment includes:
301. based on the clip operation attribute to which the plurality of sentences respectively correspond, determining display modes corresponding to the multiple sentences respectively; the audio text synchronized with the video to be clipped comprises a plurality of sentences, the sentences are respectively associated with corresponding video clips in the video to be clipped, and the clipping operation attribute represents the clipping operation type of a user;
302. And responding to a restoration operation triggered after the user selects a deleted sentence according to the display mode, restoring the video fragment associated with the deleted sentence, and adjusting the clipping operation attribute corresponding to the deleted sentence to change the display mode.
The method provided in this embodiment is a scheme in which the "clipping operation identifier" mentioned above is an attribute that does not correspond to a specific visual form.
The clipping operation attribute in this embodiment may include: a first attribute indicating that the sentence has not been pruned and prompting the user that it can be pruned, and a second attribute indicating that the sentence has been pruned and prompting the user that it can be recovered at any time. Different attributes may correspond to different display modes, so that sentences with different attributes are displayed differently. For example, a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed in gray; or, a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed with a strikethrough; or, a sentence with the first attribute is displayed normally, while a sentence with the second attribute is displayed in a flashing manner; and so on.
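A compact sketch of this attribute-driven display, with restore implemented as adjusting the attribute so the display mode changes as a side effect. The attribute and style names are illustrative placeholders, not from the patent.

```python
# Sketch: clipping operation attributes mapped to display modes;
# restoring a sentence flips its attribute, which changes how it is shown.

DISPLAY_MODES = {
    "not_pruned": "normal",
    "pruned": "gray-strikethrough",
}

def display_mode(sentence):
    return DISPLAY_MODES[sentence["attribute"]]

def restore(sentence):
    sentence["attribute"] = "not_pruned"   # adjusting the attribute...
    return display_mode(sentence)          # ...changes the display mode

s = {"text": "hello", "attribute": "pruned"}
print(display_mode(s))   # deleted sentence shown grayed with strikethrough
print(restore(s))        # after restoring, it is shown normally
```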
For example, as shown in fig. 6a, the user selects the second sentence, and then clicks the right button; displaying an operation selection box as shown in fig. 6b on the editing interface, wherein the user can recover the sentence and recover the video clip associated with the sentence by clicking a 'recover' option in the operation selection box; the operation herein may also be canceled by clicking on the "cancel" option in the operations box.
For example, in one implementation solution, the clipping operation attribute in this embodiment may include: deleted properties to be restored, undeleted deletable properties, and the like, which are not particularly limited in this embodiment.
Fig. 7 shows a flowchart of an audio clipping method according to an embodiment of the present application. As shown in fig. 7, the method comprises the following steps:
401. and displaying the clipping operation identifiers corresponding to the multiple sentences respectively.
The audio text synchronized with the audio to be clipped comprises a plurality of sentences, the sentences are respectively associated with corresponding audio fragments in the audio to be clipped, and the clipping operation identification represents the clipping operation type of the user and is used for assisting the user to randomly delete or restore the audio fragments associated with the sentences.
402. In response to a restoration operation triggered by a user selecting a pruned sentence with reference to a clipping operation identification, restoring an audio clip associated with the pruned sentence.
Further, the clipping operation identifier has two kinds of display icons, which are respectively: a first icon indicating that the user has not pruned and prompting the user to make a pruning and a second icon indicating that the user has pruned and prompting the user to resume at any time. Correspondingly, the method provided by the embodiment may further include the following steps:
403. responding to a recovery operation triggered after a user selects a deleted sentence with reference to a clip operation identifier, and displaying the clip operation identifier corresponding to the deleted sentence as the first icon by switching the second icon;
404. and deleting the audio fragment associated with the non-pruned sentence in response to a deleting operation for the non-pruned sentence triggered by the user based on the clip operation identification, and displaying the clip operation identification corresponding to the non-pruned sentence as the second icon by switching the first icon.
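Steps 403 and 404 above describe the icon switching that accompanies each delete and restore; this can be sketched as a two-state toggle in which the deleted flag and the displayed icon always change together. The icon names are placeholders, not the patent's actual icons.

```python
# Sketch of the icon switching in steps 403/404: deleting shows the second
# icon (prompting recovery), restoring switches back to the first icon.

FIRST_ICON, SECOND_ICON = "scissors", "undo"

def toggle(sentence):
    """Flip a sentence's deleted flag and its clipping operation icon."""
    sentence["deleted"] = not sentence["deleted"]
    sentence["icon"] = SECOND_ICON if sentence["deleted"] else FIRST_ICON
    return sentence

s = {"deleted": False, "icon": FIRST_ICON}
toggle(s)   # user deletes the sentence's associated audio clip
print(s["icon"])
toggle(s)   # user restores it
print(s["icon"])
```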
The clipping object in this embodiment differs from that of the above embodiments, and the clipping object in at least some of the steps of the above embodiments may be replaced with the audio to be clipped in this embodiment. That is, in addition to the steps described above, this embodiment may include at least some of the steps of each of the above embodiments. For example, the clipping interface mentioned in the above embodiments may still have a video playing area, in which no content is displayed, only a playing identifier is displayed, or a picture is displayed, which is not limited in this embodiment.
The execution subject of the technical solutions provided by the embodiments of the present application may be clipping software or a newly added functional module on an existing application. For example, software with functions corresponding to the technical solutions provided in the embodiments of the present application is installed on a client device. As shown in fig. 8, a user may import a video to be clipped, and the execution subject of the method provided in the embodiments of the present application may perform speech recognition and/or subtitle recognition on the audio synchronized with the video to be clipped to obtain the audio text (i.e., subtitles); then, the sentences in the audio text are displayed. The user can perform the operations shown in fig. 8 on each sentence: clipping operations (deleting unwanted sentences in a text-editing manner, thereby deleting the video clips associated with them), editing operations (editing sentences in the audio text), effect-adding operations (such as adding text styles, background sounds, stickers, etc.), and so on. As can be seen from fig. 8, in the solution provided in the embodiments of the present application, the deletion of sentences and the clipping of video segments are separated from the other two types of operations; even if these operations are interleaved, the user can accurately withdraw any particular operation when desired.
In summary, the embodiment of the application provides a scheme for unordered withdrawal, and a user can flexibly recover deleted contents through the scheme, so that flexibility and efficiency of video editing of the user are improved. Can be embodied in the following points:
1. The operation efficiency is high: after the user has deleted sentences multiple times, if the user wants to recover a sentence deleted in a certain operation, the user can restore that sentence directly.
2. Accurate recovery of deleted sentences: when performing unordered withdrawal, the user can directly withdraw a particular deleting operation, since the deleting operation is separated from editing operations and other adding operations.
3. The adjustment space is large: the deleted content is still visible, and the user can flexibly resume and delete again and preview in time.
Fig. 9 shows a schematic structural diagram of a video editing apparatus according to an embodiment of the present application. As shown in fig. 9, the video clip apparatus includes: a display module 11 and a clipping module 12. The display module 11 is configured to display a plurality of sentences and clipping operation identifiers corresponding to the plurality of sentences respectively; the audio text synchronized with the video to be clipped comprises a plurality of sentences, the sentences are respectively associated with corresponding video clips in the video to be clipped, and the clipping operation identification represents the clipping operation type of the user and is used for assisting the user to randomly delete or restore the video clips associated with the sentences. The clipping module 12 is configured to resume the video segments associated with a pruned sentence in response to a user referencing a clipping operation to identify a resume operation triggered after the pruned sentence has been selected.
Further, the clipping operation identifier has two kinds of display icons, which are respectively: a first icon indicating that the user has not pruned and prompting the user to make a pruning and a second icon indicating that the user has pruned and prompting the user to resume at any time. Correspondingly, the display module 11 in the device provided in this embodiment is further configured to:
and responding to a recovery operation triggered after the user selects a deleted sentence with reference to the clip operation identification, and displaying the clip operation identification corresponding to the deleted sentence as the first icon by switching the second icon.
Further, when displaying a plurality of sentences and the clipping operation identifiers corresponding to the sentences, the display module 11 is specifically configured to:
sequentially displaying the sentences according to the playing sequence of the video to be clipped;
and respectively displaying the corresponding clipping operation identifiers on the corresponding rows of the plurality of sentences.
Further, the display module 11 is further configured to display a clipping interface; the editing interface comprises a video playing area and a sentence displaying area for displaying the sentences; and playing the video to be clipped in the video playing area. Correspondingly, the video clipping apparatus provided in this embodiment may further include a determining module. The determining module is used for determining the sentences based on the playing progress of the video to be clipped.
Further, the display module 11 is specifically configured to, when sequentially displaying the plurality of sentences according to the playing order of the video to be clipped:
and in the sentence display area, scrolling and displaying at least one sentence associated with the sentence matched with the playing progress and the video clip to be played after the playing progress.
Further, the determining module in the video clipping apparatus provided in this embodiment may be further configured to: determining a sliding track in response to a sliding operation of a user; and determining a plurality of sentences to be operated according to the sliding track. Correspondingly, the display module 11 is further configured to switch and display clip operation identifiers corresponding to the multiple to-be-operated sentences respectively, so that the first icon is switched to the second icon, or the second icon is switched to the first icon. The clipping module 12 is further configured to: and respectively performing clipping operation on the video clips related to the multiple sentences to be operated so as to delete or restore the corresponding video clips.
Further, the video editing apparatus provided in this embodiment may further include an identification module and an editing module. The recognition module is used for carrying out voice recognition and/or subtitle recognition on the audio synchronized with the video to be clipped so as to obtain the audio text. The editing module is used for responding to the editing operation of a user for one sentence to be modified in the displayed multiple sentences, and modifying the sentence to be modified based on the editing content of the user; wherein the modification to the statement includes at least one of: text content modification, text style modification, sentence breaking modification.
What needs to be explained here is: the video editing device provided in the foregoing embodiments may implement the technical solutions described in the foregoing method embodiments, and the specific implementation principles of the foregoing modules or units may refer to corresponding contents in the foregoing method embodiments, which are not repeated herein.
Another embodiment of the present application provides a video editing apparatus. The construction of the video clip apparatus is similar to the embodiment shown in fig. 9 described above. Specifically, the video clipping apparatus includes: and the display module and the clipping module. The display module is used for displaying a plurality of deleted sentences and recovery marks corresponding to the deleted sentences respectively; the audio text synchronized with the video to be clipped contains the plurality of pruned sentences, and the plurality of pruned sentences are respectively associated with corresponding video clips in the video to be clipped. And the clipping module is used for responding to the operation triggered by the recovery mark corresponding to any target sentence in the plurality of deleted sentences by a user, and recovering the video clips associated with the target sentences.
Further, the display module is specifically configured to, when displaying the plurality of pruned sentences:
sequentially displaying the plurality of pruned sentences in a time sequence of user pruned operations; or alternatively
And sequentially displaying the plurality of deleted sentences according to the playing sequence of the video to be clipped.
Further, the display module is further configured to:
displaying a clipping interface; the editing interface comprises a video playing area and a sentence displaying area;
playing the video to be clipped in the video playing area;
displaying a plurality of un-deleted sentences in the sentence display area; wherein the plurality of un-deleted sentences include a sentence adapted to the playing progress of the video to be clipped;
and if it is determined according to the playing order of the video to be clipped that at least one deleted sentence exists after an un-deleted sentence, displaying prompt information at the corresponding position of that un-deleted sentence, so as to prompt the user that deleted sentences exist after it.
Further, when displaying the plurality of pruned sentences and the recovery identifications corresponding to the pruned sentences, the display module is specifically configured to:
responding to the operation of the user on the prompt information, and acquiring the multiple deleted sentences after the prompt information corresponds to the un-deleted sentences;
displaying the deleted sentences and the recovery identifications corresponding to the deleted sentences respectively in a popup window; or shifting down the sentences shown after the non-pruned sentences to leave a space, and showing the pruned sentences and the recovery marks corresponding to the pruned sentences in the space.
What needs to be explained here is: the video editing device provided in the foregoing embodiments may implement the technical solutions described in the foregoing method embodiments, and the specific implementation principles of the foregoing modules or units may refer to corresponding contents in the foregoing method embodiments, which are not repeated herein.
Yet another embodiment of the present application provides a video editing apparatus. The structure of the video editing apparatus is similar to the embodiment shown in fig. 9 described above. Specifically, the video editing apparatus comprises: a display module and a clipping module. The display module is used for determining display modes corresponding to the sentences respectively based on clipping operation attributes corresponding to the sentences respectively; the audio text synchronized with the video to be clipped comprises the plurality of sentences, the sentences are respectively associated with corresponding video clips in the video to be clipped, and the clipping operation attribute represents the clipping operation type of a user. The clipping module is used for responding to a restoration operation triggered after a user selects a deleted sentence according to a display mode, restoring the video fragment associated with the deleted sentence, and adjusting the clipping operation attribute corresponding to the deleted sentence to change the display mode.
What needs to be explained here is: the video clipping apparatus provided in the above embodiments can implement the technical solutions described in the above method embodiments, the specific implementation principle of each module or unit may refer to the corresponding content in each method embodiment, and will not be repeated herein.
Another embodiment of the present application provides an audio editing apparatus. The structure of the audio clipping apparatus is similar to the embodiment shown in fig. 9 described above. Specifically, the audio clipping apparatus includes: and the display module and the clipping module. The display module is used for displaying a plurality of sentences and clipping operation identifiers corresponding to the sentences respectively; the audio text synchronized with the audio to be clipped comprises a plurality of sentences, the sentences are respectively associated with corresponding audio fragments in the audio to be clipped, and the clipping operation identification represents the clipping operation type of the user and is used for assisting the user to randomly delete or restore the audio fragments associated with the sentences. The clipping module is used for responding to a restoration operation triggered by the user after selecting a deleted sentence according to the clipping operation identification, and restoring the audio fragment associated with the deleted sentence.
Further, the clipping operation identifier has two kinds of display icons, which are respectively: a first icon indicating that the user has not pruned and prompting the user to make a pruning and a second icon indicating that the user has pruned and prompting the user to resume at any time. Correspondingly, the display module is further used for:
Responding to a recovery operation triggered after a user selects a deleted sentence with reference to a clip operation identifier, and displaying the clip operation identifier corresponding to the deleted sentence as the first icon by switching the second icon;
and responding to a deleting operation for an un-deleted sentence triggered by the user based on the clipping operation identifier, deleting the audio clip associated with the un-deleted sentence, and switching the clipping operation identifier corresponding to the un-deleted sentence from the first icon to be displayed as the second icon.
What needs to be explained here is: the audio editing apparatus provided in the foregoing embodiments may implement the technical solutions described in the foregoing method embodiments, and the specific implementation principles of the foregoing modules or units may refer to corresponding contents in the foregoing method embodiments, which are not repeated herein.
Fig. 10 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device comprises a processor 32 and a memory 31, wherein the memory 31 is configured to store one or more computer instructions; the processor 32 is coupled to the memory 31 and configured to execute the one or more computer instructions (e.g., computer instructions implementing data storage logic) to implement the steps of the video editing method embodiments described above, or the steps of the audio editing method embodiments described above.
The memory 31 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
Further, as shown in fig. 10, the electronic device further includes: communication component 33, power component 35, display 34, audio component 36, and other components. Only some of the components are schematically shown in fig. 10, which does not mean that the electronic device only comprises the components shown in fig. 10.
Yet another embodiment of the present application provides a computer program product (not shown in the drawings of the specification). The computer program product comprises a computer program or instructions which, when executed by a processor, cause the processor to carry out the steps of the method embodiments described above.
Accordingly, embodiments of the present application also provide a computer-readable storage medium storing a computer program capable of implementing the method steps or functions provided by the above embodiments when executed by a computer.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (14)

1. A video editing method, comprising:
displaying a plurality of sentences and clipping operation identifiers corresponding to the sentences respectively; the audio text synchronized with the video to be clipped comprises a plurality of sentences, the sentences are respectively associated with corresponding video clips in the video to be clipped, and the clipping operation identification represents the clipping operation type of the user and is used for assisting the user to randomly delete or restore the video clips associated with the sentences; the plurality of sentences comprise a plurality of deleted sentences, and the clipping operation marks corresponding to the deleted sentences are respectively used as recovery marks;
and restoring the video segment associated with the selected pruned sentence in response to a restoration operation triggered by a restoration identifier corresponding to the selected pruned sentence after the user selects the pruned sentence with reference to the clipping operation identifier.
2. The method of claim 1, wherein the clipping operation identifier has two display icons, namely: a first icon indicating that the user has not pruned and indicating that the user is able to prune, and a second icon indicating that the user has pruned and indicating that the user is able to recover at any time; and
the method further comprises the steps of:
in response to a restore operation triggered after the user selects a deleted sentence with reference to the clipping operation identifiers, switching the clipping operation identifier corresponding to the deleted sentence from the second icon to the first icon.
3. The method of claim 1, wherein displaying the plurality of sentences and the clipping operation identifiers respectively corresponding to the plurality of sentences comprises:
sequentially displaying the plurality of sentences in the playing order of the video to be clipped;
and respectively displaying the corresponding clipping operation identifiers in the rows of the corresponding sentences.
4. The method of claim 3, further comprising:
displaying a clipping interface; wherein the clipping interface comprises a video playing area and a sentence display area for displaying the plurality of sentences;
playing the video to be clipped in the video playing area;
and determining the plurality of sentences based on the playing progress of the video to be clipped.
5. The method of claim 4, wherein sequentially displaying the plurality of sentences in the playing order of the video to be clipped comprises:
scrolling, in the sentence display area, to display the sentence matching the playing progress and at least one sentence associated with a video segment to be played after the playing progress.
6. The method according to any one of claims 2 to 5, further comprising:
determining a sliding track in response to a sliding operation of a user;
determining a plurality of sentences to be operated according to the sliding track;
switching the display of the clipping operation identifiers respectively corresponding to the plurality of sentences to be operated on, from the first icon to the second icon or from the second icon to the first icon;
and respectively performing clipping operations on the video segments associated with the plurality of sentences to be operated on, so as to delete or restore the corresponding video segments.
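Claim 6 describes selecting several sentences with a single sliding gesture and toggling the state of each one. A minimal sketch of that batch toggle, assuming the slide is mapped to a contiguous range of sentence rows (function names are hypothetical):

```python
def rows_on_track(start_row: int, end_row: int) -> range:
    # A vertical slide from one sentence row to another selects every
    # row in between, regardless of the slide direction.
    lo, hi = sorted((start_row, end_row))
    return range(lo, hi + 1)

def toggle_sentences(deleted_flags: list, start_row: int, end_row: int) -> list:
    # Flip each selected sentence: a first-icon (kept) sentence becomes
    # deleted (second icon), and a second-icon sentence is restored.
    flags = list(deleted_flags)
    for i in rows_on_track(start_row, end_row):
        flags[i] = not flags[i]
    return flags
```

Flipping rather than setting a fixed state matches the claim's wording, which covers both directions of the switch (delete and restore) in one operation.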
7. The method according to any one of claims 1 to 5, further comprising:
performing voice recognition and/or subtitle recognition on the audio synchronized with the video to be clipped to obtain the audio text;
in response to an editing operation of the user on a sentence to be modified among the displayed plurality of sentences, modifying the sentence to be modified based on the editing content of the user;
wherein the modification of the sentence comprises at least one of: text content modification, text style modification, and sentence-break modification.
8. A video editing method, comprising:
displaying a plurality of deleted sentences and restore identifiers respectively corresponding to the plurality of deleted sentences; wherein an audio text synchronized with a video to be clipped contains the plurality of deleted sentences, and the plurality of deleted sentences are respectively associated with corresponding video segments in the video to be clipped;
and in response to an operation triggered by the user through the restore identifier corresponding to any target sentence among the plurality of deleted sentences, restoring the video segment associated with the target sentence.
9. The method of claim 8, wherein displaying the plurality of deleted sentences comprises:
sequentially displaying the plurality of deleted sentences in the time order of the user's delete operations; or
sequentially displaying the plurality of deleted sentences in the playing order of the video to be clipped.
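Claim 9 offers two orderings for the deleted-sentence list. A hedged sketch of both, assuming each sentence carries its segment start time and, when deleted, a deletion timestamp (the field names and `ordered_deleted` helper are hypothetical):

```python
def ordered_deleted(sentences, by="play_order"):
    # 'deleted_at' records when the user deleted the sentence;
    # None means the sentence has not been deleted.
    deleted = [s for s in sentences if s["deleted_at"] is not None]
    # Sort by deletion timestamp or by segment start time (playback order).
    key = "deleted_at" if by == "deletion_time" else "start"
    return sorted(deleted, key=lambda s: s[key])
```

The two modes can differ whenever the user deletes sentences out of playback order, which is why the claim lists them as alternatives.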
10. The method of claim 8, further comprising:
displaying a clipping interface; wherein the clipping interface comprises a video playing area and a sentence display area;
playing the video to be clipped in the video playing area;
displaying a plurality of un-deleted sentences in the sentence display area; wherein the plurality of un-deleted sentences comprise a sentence matching the playing progress of the video to be clipped;
and if, according to the playing order of the video to be clipped, at least one deleted sentence exists after an un-deleted sentence, displaying prompt information at a corresponding position of the un-deleted sentence to prompt the user that deleted sentences exist thereafter.
11. The method of claim 10, wherein displaying the plurality of deleted sentences and the restore identifiers respectively corresponding to the plurality of deleted sentences comprises:
in response to an operation of the user on the prompt information, acquiring the plurality of deleted sentences located after the un-deleted sentence corresponding to the prompt information;
and displaying, in a popup window, the plurality of deleted sentences and the restore identifiers respectively corresponding to the plurality of deleted sentences; or shifting down the sentences displayed after the un-deleted sentence to leave a space, and displaying, in the space, the plurality of deleted sentences and the restore identifiers respectively corresponding to the plurality of deleted sentences.
12. A video editing method, comprising:
determining display modes respectively corresponding to a plurality of sentences based on clipping operation attributes respectively corresponding to the plurality of sentences; wherein an audio text synchronized with a video to be clipped comprises the plurality of sentences, the plurality of sentences are respectively associated with corresponding video segments in the video to be clipped, and each clipping operation attribute represents the type of clipping operation performed by a user; the plurality of sentences comprise a plurality of deleted sentences, and the clipping operation attribute corresponding to each deleted sentence indicates that the deleted sentence can be restored at any time;
and in response to a restore operation triggered by clicking an operation frame displayed for a deleted sentence selected by the user with reference to the display modes, restoring the video segment associated with the deleted sentence, and adjusting the clipping operation attribute corresponding to the deleted sentence so as to change its display mode.
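Claim 12 ties each sentence's rendering to its clipping operation attribute. A rough sketch of that mapping, with an entirely hypothetical choice of attributes ("kept"/"deleted") and visual properties (strikethrough, dimming) since the claim leaves the concrete display modes open:

```python
# Hypothetical mapping from a sentence's clipping operation attribute
# to how it is rendered in the sentence display area.
DISPLAY_MODES = {
    "kept": {"strikethrough": False, "dimmed": False},
    "deleted": {"strikethrough": True, "dimmed": True},  # recoverable at any time
}

def display_mode(clip_attribute: str) -> dict:
    return DISPLAY_MODES[clip_attribute]

def restore(sentence: dict) -> dict:
    # Clicking the operation frame of a deleted sentence restores its
    # segment and switches the attribute back, which in turn changes
    # the display mode derived from it.
    out = dict(sentence)
    out["attribute"] = "kept"
    return out
```

Deriving the display mode from the attribute, rather than storing it separately, keeps the rendering consistent with the clipping state after every restore.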
13. An audio editing method, comprising:
displaying a plurality of sentences and clipping operation identifiers respectively corresponding to the plurality of sentences; wherein an audio text synchronized with an audio to be clipped comprises the plurality of sentences, the plurality of sentences are respectively associated with corresponding audio segments in the audio to be clipped, and each clipping operation identifier represents the type of clipping operation performed by a user and is used for assisting the user in deleting or restoring, at will, the audio segment associated with the corresponding sentence; the plurality of sentences comprise a plurality of deleted sentences, and the clipping operation identifiers corresponding to the deleted sentences are respectively displayed as restore identifiers;
and in response to a restore operation triggered through the restore identifier corresponding to a deleted sentence selected by the user with reference to the clipping operation identifiers, restoring the audio segment associated with the selected deleted sentence.
14. An electronic device, comprising a memory and a processor; wherein:
the memory stores one or more computer instructions;
the processor is coupled to the memory and configured to execute the one or more computer instructions to implement the steps of the method of any one of claims 1 to 7, the steps of the method of any one of claims 8 to 11, or the steps of the method of claim 12 or 13.
CN202210234347.7A 2022-03-10 2022-03-10 Video editing method, audio editing method and electronic equipment Active CN114666637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210234347.7A CN114666637B (en) 2022-03-10 2022-03-10 Video editing method, audio editing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN114666637A CN114666637A (en) 2022-06-24
CN114666637B true CN114666637B (en) 2024-02-02

Family

ID=82029294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210234347.7A Active CN114666637B (en) 2022-03-10 2022-03-10 Video editing method, audio editing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN114666637B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115460455B (en) * 2022-09-06 2024-02-09 上海硬通网络科技有限公司 Video editing method, device, equipment and storage medium
CN117278802B (en) * 2023-11-23 2024-02-13 湖南快乐阳光互动娱乐传媒有限公司 Video clip trace comparison method and device

Citations (15)

Publication number Priority date Publication date Assignee Title
EP0404399A2 (en) * 1989-06-19 1990-12-27 International Business Machines Corporation Audio editing system
GB9521688D0 (en) * 1995-10-23 1996-01-03 Quantel Ltd An audio editing system
WO2006025833A1 (en) * 2003-07-15 2006-03-09 Kaleidescape, Inc. Displaying and presenting multiple media streams for multiple dvd sets
CN101740082A (en) * 2009-11-30 2010-06-16 孟智平 Method and system for clipping video based on browser
WO2015088196A1 (en) * 2013-12-09 2015-06-18 넥스트리밍(주) Subtitle editing apparatus and subtitle editing method
CN105142029A (en) * 2015-08-10 2015-12-09 北京彩云动力教育科技有限公司 Interactive video clipping system and interactive video clipping method
CN111050209A (en) * 2018-10-11 2020-04-21 阿里巴巴集团控股有限公司 Multimedia resource playing method and device
WO2020125365A1 (en) * 2018-12-21 2020-06-25 广州酷狗计算机科技有限公司 Audio and video processing method and apparatus, terminal and storage medium
CN111447505A (en) * 2020-03-09 2020-07-24 咪咕文化科技有限公司 Video clipping method, network device, and computer-readable storage medium
CN111666446A (en) * 2020-05-26 2020-09-15 珠海九松科技有限公司 Method and system for judging AI automatic editing video material
WO2021073315A1 (en) * 2019-10-14 2021-04-22 北京字节跳动网络技术有限公司 Video file generation method and device, terminal and storage medium
CN113204668A (en) * 2021-05-21 2021-08-03 广州博冠信息科技有限公司 Audio clipping method and device, storage medium and electronic equipment
CN113225618A (en) * 2021-05-06 2021-08-06 阿里巴巴新加坡控股有限公司 Video editing method and device
CN113891151A (en) * 2021-09-28 2022-01-04 北京字跳网络技术有限公司 Audio processing method and device, electronic equipment and storage medium
CN114157823A (en) * 2020-08-17 2022-03-08 富士胶片商业创新有限公司 Information processing apparatus, information processing method, and computer-readable medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8020100B2 (en) * 2006-12-22 2011-09-13 Apple Inc. Fast creation of video segments
CN112805675A (en) * 2018-05-21 2021-05-14 思妙公司 Non-linear media segment capture and editing platform


Non-Patent Citations (1)

Title
Design of an intelligent Chinese speech-and-text editing system based on artificial intelligence; Niu Songfeng; Tang Wei; Radio & Television Technology (广播与电视技术), Issue 04; full text *

Also Published As

Publication number Publication date
CN114666637A (en) 2022-06-24

Similar Documents

Publication Publication Date Title
CN114666637B (en) Video editing method, audio editing method and electronic equipment
US11941708B2 (en) Method, apparatus, device and medium for posting a video or image
CN113010698B (en) Multimedia interaction method, information interaction method, device, equipment and medium
US20240061560A1 (en) Audio sharing method and apparatus, device and medium
CN112153307A (en) Method and device for adding lyrics in short video, electronic equipment and storage medium
CN103530320A (en) Multimedia file processing method and device and terminal
CN112839258A (en) Video note generation method, video note playing method, video note generation device, video note playing device and related equipment
CN105760357B (en) A kind of method, apparatus and system automatically generating diary
WO2019146466A1 (en) Information processing device, moving-image retrieval method, generation method, and program
CN112752132A (en) Cartoon picture bullet screen display method and device, medium and electronic equipment
US20240104669A1 (en) Method, apparatus, device and storage medium for content capturing
CN114091422A (en) Display page generation method, device, equipment and medium for exhibition
JP4940333B2 (en) Electronic apparatus and moving image reproduction method
CN112199534A (en) Sticker recommendation method and device, electronic equipment and storage medium
WO2023179539A1 (en) Video editing method and apparatus, and electronic device
CN114866851B (en) Short video creation method based on AI image, intelligent television and storage medium
CN115297272B (en) Video processing method, device, equipment and storage medium
WO2022179415A1 (en) Audiovisual work display method and apparatus, and device and medium
CN116017043A (en) Video generation method, device, electronic equipment and storage medium
CN115767141A (en) Video playing method and device and electronic equipment
CN113190365B (en) Information processing method and device and electronic equipment
CN107277602B (en) Information acquisition method and electronic equipment
CN112307252A (en) File processing method and device and electronic equipment
CN115048010A (en) Method, device, equipment and medium for displaying audiovisual works
JP2017004193A (en) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40074565

Country of ref document: HK

GR01 Patent grant