CN114900713B - Video clip processing method and system - Google Patents

Video clip processing method and system Download PDF

Info

Publication number
CN114900713B
CN114900713B (Application CN202210817931.5A)
Authority
CN
China
Prior art keywords
video
cache space
time
video file
reserved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210817931.5A
Other languages
Chinese (zh)
Other versions
CN114900713A (en)
Inventor
朱立平
黄琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Biti Education Technology Co ltd
Original Assignee
Shenzhen Biti Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Biti Education Technology Co ltd filed Critical Shenzhen Biti Education Technology Co ltd
Priority to CN202210817931.5A priority Critical patent/CN114900713B/en
Publication of CN114900713A publication Critical patent/CN114900713A/en
Application granted granted Critical
Publication of CN114900713B publication Critical patent/CN114900713B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/231Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H04N21/23106Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion involving caching operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23406Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving management of server-side video buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4331Caching operations, e.g. of an advertisement for later insertion during playback
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44004Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving video buffer management, e.g. video decoder buffer or video display buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application discloses a video clip processing method and system. The method comprises the following steps: acquiring the duration and file size of a first video file to be edited; opening up two cache spaces in memory according to the size of the first video file; placing the first video file into the first cache space; acquiring a plurality of video segments to be reserved in the first video file; acquiring the starting time and ending time of each video segment to be reserved, and copying each segment into the second cache space in sequence according to those times; and splicing the video segments copied into the second cache space to obtain a second video file. The method and system address the prior-art problem of low efficiency when video splicing is performed after video segments are deleted during editing, thereby improving editing efficiency and shortening editing time.

Description

Video clip processing method and system
Technical Field
The present application relates to the field of video processing, and in particular, to a method and system for processing a video clip.
Background
Self-media (we-media) is a general term for new media through which individuals deliver normative and non-normative information to an unspecified, typically large audience by modern electronic means.
Most self-media channels are run by individuals who shoot videos themselves and then clip and combine them. Video clipping here refers to remixing added material such as pictures, background music, special effects and scenes with the footage, cutting and recombining the video sources, and producing, through this secondary editing, a new video with a different expressive effect. Most self-media authors are not professionals, and a great deal of time is needed from shooting to a finished film, especially for individuals who run self-media as a side job or hobby. Many short takes are typically collected during shooting, and forgotten lines, frozen frames and similar situations often occur, so a shot video file contains many unnecessary video segments.
When editing, the unnecessary video segments need to be deleted, and the reserved video segments are then spliced together to obtain the edited video file.
When such a video file is edited, the file stored on the hard disk is generally operated on directly. This way of processing is inefficient and requires a long editing time.
Disclosure of Invention
The embodiments of the application provide a video clip processing method and system, to at least solve the prior-art problem of low efficiency when video splicing is performed after video segments are deleted during editing.
According to one aspect of the application, a video clip processing method is provided, comprising: acquiring the duration and file size of a first video file to be edited; opening up two cache spaces in memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first and second cache spaces are the same size, and the capacity of each is larger than the size of the first video file; placing the first video file into the first cache space; acquiring a plurality of video segments to be reserved in the first video file, wherein the segments to be reserved are to be spliced into a second video file, the second video file being the edited video file; acquiring the starting time and ending time of each video segment to be reserved, and copying each segment into the second cache space in sequence according to those times; and splicing the video segments copied into the second cache space to obtain the second video file.
Further, acquiring the plurality of video segments to be reserved in the first video file comprises: displaying the first video file on a clipping interface along a time axis; receiving a plurality of groups of clipping time points input by a user, each group comprising a clipping starting time and a clipping ending time; and taking the video segment within each group's interval as a video segment to be reserved, wherein the starting time of the segment to be reserved is the clipping starting time and its ending time is the clipping ending time.
Further, before receiving the plurality of groups of clipping time points input by the user, the method further comprises: judging whether an abnormal segment exists in the first video file, the abnormal segment being a portion of the first video file with abnormal sound; and, if abnormal segments exist, displaying their starting and ending time points on the clipping interface, wherein those time points serve as the basis on which the user inputs the plurality of groups of clipping time points.
Further, judging whether an abnormal segment exists in the first video file comprises: extracting the audio track of the first video file to obtain the corresponding audio; performing speech-to-text recognition on the audio to obtain a text file of the audio, and performing decibel recognition on the audio along the time axis to obtain a correspondence between decibel level and the time axis; and determining whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
Further, determining whether an abnormal segment exists according to the text file and the correspondence between decibel level and the time axis comprises: searching the text for a first preset word and a second preset word and, if both appear, taking the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment; and searching the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, if such a portion is found, taking it as the abnormal segment.
According to another aspect of the application, a video clip processing system is also provided, comprising: a first acquisition module, for acquiring the duration and file size of a first video file to be edited; an opening module, for opening up two cache spaces in memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first and second cache spaces are the same size, and the capacity of each is larger than the size of the first video file; a putting module, for putting the first video file into the first cache space; a second obtaining module, for obtaining a plurality of video segments to be reserved in the first video file, wherein the segments to be reserved are to be spliced into a second video file, the second video file being the edited video file; a copying module, for acquiring the starting time and ending time of each video segment to be reserved and copying each segment into the second cache space in sequence according to those times; and a splicing module, for splicing the video segments copied into the second cache space to obtain the second video file.
Further, the second obtaining module is configured to: display the first video file on a clipping interface along a time axis; receive a plurality of groups of clipping time points input by a user, each group comprising a clipping starting time and a clipping ending time; and take the video segment within each group's interval as a video segment to be reserved, wherein the starting time of the segment to be reserved is the clipping starting time and its ending time is the clipping ending time.
Further, the system further comprises: a judging module, for judging whether an abnormal segment exists in the first video file, the abnormal segment being a portion of the first video file with abnormal sound, and, if abnormal segments exist, displaying their starting and ending time points on the clipping interface, wherein those time points serve as the basis on which the user inputs the plurality of groups of clipping time points.
Further, the judging module is configured to: extract the audio track of the first video file to obtain the corresponding audio; perform speech-to-text recognition on the audio to obtain a text file of the audio, and perform decibel recognition on the audio along the time axis to obtain a correspondence between decibel level and the time axis; and determine whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
Further, the judging module is configured to: search the text for a first preset word and a second preset word and, if both appear, take the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment; and search the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, if such a portion is found, take it as the abnormal segment.
In the embodiments of the application, the duration and file size of the first video file to be edited are acquired; two cache spaces are opened up in memory according to the size of the first video file, comprising a first cache space and a second cache space of the same size, each with a capacity larger than the first video file; the first video file is placed into the first cache space; a plurality of video segments to be reserved in the first video file are acquired, to be spliced into a second video file, the edited video file; the starting time and ending time of each segment to be reserved are acquired, and each segment is copied into the second cache space in sequence according to those times; and the segments copied into the second cache space are spliced to obtain the second video file. This solves the prior-art problem of low efficiency when video splicing is performed after video segments are deleted during editing, thereby improving editing efficiency and shortening editing time.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, are included to provide a further understanding of the application; the description of the exemplary embodiments is intended to illustrate the application and is not intended to limit it. In the drawings:
fig. 1 is a flow chart of a video clip processing method according to an embodiment of the present application.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
In the present embodiment, a video clip processing method is provided. Fig. 1 is a flowchart of a video clip processing method according to an embodiment of the present application; as shown in fig. 1, the flow includes the following steps:
step S102, the duration and the file size of a first video file to be clipped are obtained.
Step S104, opening up two cache spaces in a memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first cache space and the second cache space have the same size, and the capacities of the first cache space and the second cache space are larger than the size of the first video file.
Step S106, placing the first video file into the first cache space.
Step S108, a plurality of video segments to be reserved in the first video file are obtained, wherein the video segments to be reserved are to be spliced into a second video file, the second video file being the edited video file.
Step S110, the starting time and the ending time of each video segment to be reserved are obtained, and each video segment to be reserved is copied into the second cache space in sequence according to its starting time and ending time.
As an optional embodiment, in this step, after the starting time and the ending time of each video segment to be reserved are obtained, a first memory index corresponding to the starting time and a second memory index corresponding to the ending time of each segment are looked up in the first cache space, wherein the first and second memory indexes indicate positions within the first cache space and the first cache space is a contiguous cache space; the video segment to be reserved is then located in the first cache space according to the first and second memory indexes, and the located segment is copied into the second cache space.
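To make the index lookup concrete, the following is a minimal Python sketch, under the assumption of a roughly constant-bitrate file so that a timestamp maps to a byte offset proportionally; the names time_to_index and copy_segment, and the use of plain byte buffers for the two cache spaces, are illustrative assumptions rather than details taken from the application.

```python
# A hedged sketch: map timestamps to byte offsets inside the contiguous first
# cache, then copy the located bytes into the second cache. Constant bitrate is
# an assumption; time_to_index and copy_segment are illustrative names.

def time_to_index(t_seconds: float, duration: float, file_size: int) -> int:
    """Map a timestamp to a 'memory index' (byte offset) in the first cache space."""
    return int(file_size * (t_seconds / duration))

def copy_segment(first_buf: bytes, second_buf: bytearray, write_pos: int,
                 start_t: float, end_t: float, duration: float) -> int:
    """Copy one video segment to be reserved from the first cache into the second cache."""
    first_index = time_to_index(start_t, duration, len(first_buf))    # index of the starting time
    second_index = time_to_index(end_t, duration, len(first_buf))     # index of the ending time
    chunk = first_buf[first_index:second_index]
    second_buf[write_pos:write_pos + len(chunk)] = chunk
    return write_pos + len(chunk)       # next write position, so segments land in order
```

Called once per segment to be reserved, copy_segment advances a write position inside the second buffer so that the reserved segments arrive there in sequence.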
Optionally, when the first and second cache spaces are opened up, it is first judged whether the remaining memory is larger than the sum of the first and second cache spaces. If so, the contiguous free regions in memory are enumerated and the first cache space is opened preferentially in the largest contiguous free region; after the first cache space has been opened, the second cache space is opened in the remaining memory. This optional implementation ensures that the first cache space uses contiguous memory as far as possible, further improving efficiency.
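A minimal sketch of this allocation strategy follows, with free memory modeled simply as a list of contiguous free block sizes in bytes; allocate_two_caches is an assumed helper name, not a function from the application.

```python
# A minimal sketch of the allocation strategy, assuming free memory is modeled as a
# list of contiguous free block sizes in bytes; allocate_two_caches is an assumed name.

def allocate_two_caches(free_blocks: list, needed: int):
    """Return (first_block, second_block) indexes, or None if both caches do not fit."""
    if not free_blocks or sum(free_blocks) < 2 * needed:
        return None                                   # fall back to the cases described below
    # open up the first cache space preferentially in the largest contiguous free block
    first = max(range(len(free_blocks)), key=lambda i: free_blocks[i])
    if free_blocks[first] < needed:
        return None                                   # no single block can hold the first cache
    remaining = list(free_blocks)
    remaining[first] -= needed
    # the second cache space is opened up in whatever space remains
    second = max(range(len(remaining)), key=lambda i: remaining[i])
    return first, second
```

Returning None corresponds to the fallback cases described next (index-only editing, or editing on disk).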
When the remaining memory is larger than the first cache space but smaller than the sum of the first and second cache spaces, only the first cache space is opened in memory and the second cache space is not opened. After the first cache space is opened, a third cache space is opened; the third cache space stores, in memory and in segment order, the first and second memory indexes of each video segment to be reserved. When the user performs splice editing, the memory indexes stored in the third cache space are adjusted according to the user's operations on each segment to be reserved. When the user clicks to finish splice editing and starts generating the video, all memory indexes are read from the third cache space, the reserved segments and their order are selected in the first cache space according to the indexes read, and the segments that are not reserved are deleted from the first cache space.
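The index-only editing mode might look like the following sketch, which assumes the third cache space is simply an ordered list of (first_index, second_index) pairs; apply_user_edit and materialize are illustrative names, not taken from the application.

```python
# A hedged sketch of the index-only editing mode: the third cache space is modeled
# as an ordered list of (first_index, second_index) pairs; apply_user_edit and
# materialize are illustrative names, not taken from the application.

segment_indexes = []        # third cache space: one (start, end) index pair per reserved segment

def apply_user_edit(new_order):
    """Adjust the stored memory indexes according to the user's splice-editing operations."""
    global segment_indexes
    segment_indexes = [segment_indexes[i] for i in new_order]

def materialize(first_buf: bytes) -> bytes:
    """On 'finish splice editing', read the indexes and keep only the reserved bytes, in order."""
    return b"".join(first_buf[start:end] for start, end in segment_indexes)
```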
When the remaining space in the memory is smaller than the first cache space, the video file to be edited is stored on the hard disk and edited there, in the same way as in the prior art, which is not repeated here.
Step S112, the video segments copied into the second cache space are spliced to obtain the second video file.
As an optional implementation, after the video segments copied into the second cache space have been spliced to obtain the second video file, the first video file stored in the first cache space is copied to and saved on the hard disk; once it has been saved to the hard disk, the first cache space is emptied and the spliced second video file is stored in the first cache space. In this case the video segments used to splice the second video file remain in the second cache space and the first cache space holds the second video file. A browsing request is then received from the user, the browsing request asking to play and preview the second video file; the second video file is fetched from the first cache space according to the browsing request and played.
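A minimal sketch of this backup-and-preview flow is given below, assuming the caches are plain in-memory byte buffers; backup_path is an illustrative location for the saved original, not a path from the application.

```python
# A minimal sketch of the backup-and-preview flow, assuming the caches are plain
# in-memory byte buffers; backup_path is an illustrative location for the original file.

def finish_and_preview(first_cache: bytearray, spliced_video: bytes, backup_path: str) -> bytes:
    with open(backup_path, "wb") as f:         # save the original first video file to the hard disk
        f.write(first_cache)
    first_cache.clear()                        # empty the first cache space
    first_cache.extend(spliced_video)          # store the spliced second video file in it
    return bytes(first_cache)                  # returned to the player when a browsing request arrives
```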
In the above steps, the processing of the first video file is distributed across two different memory cache spaces, and operating in memory greatly improves editing efficiency. The above steps therefore solve the prior-art problem of low efficiency when video splicing is performed after video segments are deleted during editing, improving editing efficiency and shortening editing time.
In this embodiment, the method can work together with video editing software: the first video file is displayed on a clipping interface along a time axis; a plurality of groups of clipping time points input by the user are received, each group comprising a clipping starting time and a clipping ending time; and the video segment within each group's interval is taken as a video segment to be reserved, wherein the starting time of the segment to be reserved is the clipping starting time and its ending time is the clipping ending time.
As a more intelligent processing manner, before the plurality of groups of clipping time points input by the user are received, the method further includes: judging whether an abnormal segment exists in the first video file, the abnormal segment being a portion of the first video file with abnormal sound; and, if an abnormal segment exists, displaying its starting and ending time points on the clipping interface, wherein those time points serve as the basis on which the user inputs the plurality of groups of clipping time points.
There are many ways to determine the abnormal segments. For example, judging whether an abnormal segment exists in the first video file includes: extracting the audio track of the first video file to obtain the corresponding audio; performing speech-to-text recognition on the audio to obtain a text file of the audio, and performing decibel recognition on the audio along the time axis to obtain a correspondence between decibel level and the time axis; and determining whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
Optionally, determining whether an abnormal segment exists according to the text file and the correspondence between decibel level and the time axis includes: searching the text for a first preset word and a second preset word and, if both appear, taking the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment; and searching the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, if such a portion is found, taking it as the abnormal segment.
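The two detection rules can be sketched as follows. The sketch assumes the speech-to-text step has already produced (word, timestamp) pairs and that a decibel level is available per second of audio; the preset words and threshold values shown are illustrative assumptions, not values specified by the application.

```python
# A hedged sketch of both detection rules. It assumes the speech-to-text step yields
# (word, timestamp) pairs and that a decibel level is available per second of audio;
# the preset words and thresholds are illustrative assumptions, not values from the text.

FIRST_WORD = "cut"          # assumed first preset word (spoken when a take goes wrong)
SECOND_WORD = "again"       # assumed second preset word (spoken when re-recording starts)
MIN_DB = 20.0               # assumed preset decibel value
MIN_QUIET_S = 3             # assumed preset duration in seconds

def abnormal_from_text(words):
    """Return (start, end) of the clip between the two preset words, if both occur in order."""
    t1 = next((t for w, t in words if w == FIRST_WORD), None)
    if t1 is None:
        return None
    t2 = next((t for w, t in words if w == SECOND_WORD and t > t1), None)
    return (t1, t2) if t2 is not None else None

def abnormal_from_db(db_per_second):
    """Return (start, end) of the first stretch quieter than MIN_DB lasting over MIN_QUIET_S."""
    start = None
    for i, db in enumerate(list(db_per_second) + [MIN_DB + 1.0]):   # sentinel flushes a trailing run
        if db < MIN_DB:
            if start is None:
                start = i
        else:
            if start is not None and i - start > MIN_QUIET_S:
                return (start, i)
            start = None
    return None
```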
In the above steps, abnormal segments of the video can be searched for: the acquired video material to be processed (i.e. the first video file, the video file to be processed or clipped) is analyzed and its abnormal segments are identified; the abnormal segments are cut out of the video material and the cut-out segments are stored in an abnormality collection; finally, the pre-processed videos obtained by cutting are numbered and stored in a pre-processing collection. This is explained below.
There are various ways to identify abnormal segments in the video material. For example, analyzing the video material to be processed and identifying its abnormal segments specifically includes: analyzing the video material to be processed, locating segments whose volume is below a detection decibel value, and marking them as silent segments; identifying the specific recorded voice and the specific ending voice in the video material, marking the segment before the specific recorded voice as a preparation segment and the segment after the specific ending voice as a closing segment; and locating interrupted speech in the video material and marking it as a position to be processed.
Processing can be done based on whether silence exists: for example, the silent segments in the video material are cut out to obtain a primary processed video; the preparation segments and closing segments in the primary processed video are cut out to obtain a secondary processed video; the positions where speech is interrupted in the secondary processed video are marked and the video is numbered to obtain a pre-processed video; and the pre-processed videos are stored in the pre-processing collection in numbered order.
Analyzing the video material to be processed, locating segments whose volume is below the detection decibel value and marking them as silent segments specifically includes: extracting the audio information from the video material to be processed; detecting the decibel value of the audio information; locating the segments of the audio whose decibel value is below the detection decibel value, marking them as silent segments and numbering them; and associating each silent segment's number with the number of the video material it belongs to.
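A minimal sketch of this silence-location step follows, assuming the extracted audio track is available as a mono numpy array of PCM samples in [-1, 1]; the window length and the detection decibel value are illustrative assumptions.

```python
# A minimal sketch of the silence-location step, assuming the extracted audio track is
# a mono numpy array of PCM samples in [-1, 1]; the window length and the detection
# decibel value are illustrative assumptions.

import numpy as np

def locate_silence(samples: np.ndarray, sr: int, threshold_db: float = -40.0,
                   window_s: float = 0.5):
    """Return numbered (start_s, end_s) silent segments whose level stays below threshold_db."""
    win = max(1, int(sr * window_s))
    segments, start = [], None
    for i in range(0, len(samples) - win + 1, win):
        rms = float(np.sqrt(np.mean(samples[i:i + win] ** 2))) + 1e-12   # avoid log of zero
        db = 20.0 * np.log10(rms)
        t = i / sr
        if db < threshold_db and start is None:
            start = t
        elif db >= threshold_db and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(samples) / sr))
    return list(enumerate(segments))     # number each silent segment, as described above
```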
The specific voices can also be processed. For example, identifying the specific recorded voice and the specific ending voice in the video material, marking the segment before the specific recorded voice as a preparation segment and marking the segment after the specific ending voice as a closing segment specifically includes: identifying the specific recorded voice and the specific ending voice in the video material; locating the position where the specific recorded voice ends and the position where the specific ending voice begins, and marking them as a start mark and an end mark; and, starting from the first start mark, grouping the start marks and end marks in the video material pairwise in order to form pairs of cutting marks, associating each start mark with the end mark in its group.
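Pairing the start and end marks into cutting marks might look like the following sketch, where start_marks are the timestamps at which the specific recorded voice ends and end_marks are the timestamps at which the specific ending voice begins; timestamps in seconds and the function name are assumptions for illustration.

```python
# A hedged sketch of the mark-pairing step: start_marks are the timestamps at which the
# specific recorded voice ends, end_marks the timestamps at which the specific ending
# voice begins; timestamps in seconds are an assumption, as is the function name.

def pair_cut_marks(start_marks, end_marks):
    """Group start and end marks pairwise, in order, starting from the first start mark."""
    pairs = []
    for start, end in zip(sorted(start_marks), sorted(end_marks)):
        if end > start:
            pairs.append((start, end))    # one pair of cutting marks bounds one kept take
    return pairs
```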
Cutting the abnormal segments out of the video material and storing them in the abnormality collection specifically includes: cutting the silent segments out of the video material, inserting each silent segment's number at the cut position and marking the cut-out silent segment with the same number; and storing the cut-out silent segments in the silent-segment subset of the abnormality collection.
Cutting the abnormal segments out of the video material and storing them in the abnormality collection also specifically includes: using each pair of cutting marks as cut points, cutting out and keeping the video segment between the pair of cutting marks and numbering the kept segment; extracting the content of the specific recorded voice and using it as the number of the kept segment; numbering the preparation segment and closing segment before and after the kept segment, the preparation segment and closing segment being associated with the kept segment's number; and storing the preparation segment and the closing segment respectively in the preparation subset and the closing subset of the abnormality collection.
The method for processing abnormal video can further include the following steps: combining the pre-processed videos in the pre-processing collection in numbered order; monitoring the total size after several pre-processed videos have been combined; when the size of the combined video exceeds a preset size, appending no further pre-processed videos to the current combination and renumbering the combined video; sending the combined pre-processed video to a manual end for further processing to obtain a deeply processed video; and receiving the deeply processed videos returned by the manual end, arranging them by number and sending them to the reviewing end.
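The size-capped merging can be sketched as below, assuming each pre-processed clip is described by its number and size in bytes and MAX_BYTES stands in for the preset memory value (the 2 GB figure is only illustrative).

```python
# A minimal sketch of the size-capped merging: clips are (number, size_in_bytes) pairs
# combined in numbered order, and MAX_BYTES stands in for the preset memory value
# (2 GB is an illustrative figure only).

MAX_BYTES = 2 * 1024 ** 3

def combine_by_size(clips):
    """Combine clips in numbered order; start a new combined video once the cap would be exceeded."""
    batches, current, current_size = [], [], 0
    for number, size in sorted(clips):
        if current and current_size + size > MAX_BYTES:
            batches.append(current)          # this combined video is closed and renumbered
            current, current_size = [], 0
        current.append(number)
        current_size += size
    if current:
        batches.append(current)              # each batch is then sent to the manual end
    return batches
```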
In this embodiment, an electronic device is provided, comprising a memory in which a computer program is stored and a processor configured to run the computer program to perform the method in the above embodiments.
The programs described above may be run on a processor or stored in memory (also referred to as computer-readable media), which includes volatile and non-volatile, removable and non-removable media implementing information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus so that a series of operational steps is performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart and/or block diagram blocks; the corresponding steps may be implemented by different modules.
Such an apparatus or system is provided in this embodiment. The system, referred to as a video clip processing system, comprises: a first acquisition module, for acquiring the duration and file size of a first video file to be edited; an opening module, for opening up two cache spaces in memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first and second cache spaces are the same size, and the capacity of each is larger than the size of the first video file; a putting module, for putting the first video file into the first cache space; a second obtaining module, for obtaining a plurality of video segments to be reserved in the first video file, wherein the segments to be reserved are to be spliced into a second video file, the second video file being the edited video file; a copying module, for acquiring the starting time and ending time of each video segment to be reserved and copying each segment into the second cache space in sequence according to those times; and a splicing module, for splicing the video segments copied into the second cache space to obtain the second video file.
The system or apparatus is configured to implement the functions of the method in the foregoing embodiments, and each module in the system or apparatus corresponds to a step in the method; these have already been described for the method and are not described again here.
For example, the second obtaining module is configured to: display the first video file on a clipping interface along a time axis; receive a plurality of groups of clipping time points input by the user, each group comprising a clipping starting time and a clipping ending time; and take the video segment within each group's interval as a video segment to be reserved, wherein the starting time of the segment to be reserved is the clipping starting time and its ending time is the clipping ending time.
For another example, the system further comprises: a judging module, for judging whether an abnormal segment exists in the first video file, the abnormal segment being a portion of the first video file with abnormal sound, and, if an abnormal segment exists, displaying its starting and ending time points on the clipping interface, wherein those time points serve as the basis on which the user inputs the plurality of groups of clipping time points.
For another example, the judging module is configured to: extract the audio track of the first video file to obtain the corresponding audio; perform speech-to-text recognition on the audio to obtain a text file of the audio, and perform decibel recognition on the audio along the time axis to obtain a correspondence between decibel level and the time axis; and determine whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
For another example, the judging module is configured to: search the text for a first preset word and a second preset word and, if both appear, take the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment; and search the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, if such a portion is found, take it as the abnormal segment.
Optionally, the splicing module is further configured to, after the video segments copied into the second cache space have been spliced to obtain the second video file, copy and save the first video file stored in the first cache space to the hard disk; once it has been saved to the hard disk, the first cache space is emptied and the spliced second video file is stored in the first cache space, in which case the video segments used to splice the second video file remain in the second cache space and the first cache space holds the second video file; the splicing module then receives a browsing request from the user, the browsing request asking to play and preview the second video file, fetches the second video file from the first cache space according to the browsing request, and plays it.
This embodiment solves the prior-art problem of low efficiency when video splicing is performed after video segments are deleted during editing, thereby improving editing efficiency and shortening editing time.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of video clip processing, comprising:
acquiring the duration and the file size of a first video file to be edited;
opening up two cache spaces in a memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first cache space and the second cache space are the same size, and the capacity of each of the first cache space and the second cache space is larger than the size of the first video file; when the first cache space and the second cache space are opened up, judging whether the space left in the memory is larger than the sum of the first cache space and the second cache space and, if so, searching for a plurality of contiguous free spaces in the memory and opening up the first cache space preferentially in the largest contiguous free space; and, after the first cache space is opened up, opening up the second cache space in the remaining space in the memory;
placing the first video file into a first cache space;
acquiring a plurality of video segments to be reserved in the first video file, wherein the video segments to be reserved are to be spliced into a second video file, the second video file being the edited video file;
acquiring the starting time and the ending time of each video segment to be reserved, and copying each video segment to be reserved into the second cache space in sequence according to the starting time and the ending time; after the starting time and the ending time of each video segment to be reserved are obtained, searching, in the first cache space, for a first memory index corresponding to the starting time and a second memory index corresponding to the ending time of each video segment to be reserved, wherein the first memory index and the second memory index indicate positions within the first cache space and the first cache space is a contiguous cache space; locating the video segment to be reserved in the first cache space according to the first memory index and the second memory index, and copying the located video segment to be reserved into the second cache space;
and splicing the video segments copied into the second cache space to obtain the second video file.
2. The method of claim 1, wherein obtaining the plurality of video segments to be reserved in the first video file comprises:
displaying the first video file on a clipping interface according to a time axis;
receiving a plurality of groups of clipping time points input by a user, wherein each group of clipping time points comprises a clipping start time and a clipping end time;
and taking the video segment within each group of clipping time points as a video segment to be reserved, wherein the starting time of the video segment to be reserved is the clipping start time and the ending time of the video segment to be reserved is the clipping end time.
3. The method of claim 2, wherein prior to receiving the plurality of sets of clipping time points of the user input, the method further comprises:
judging whether an abnormal segment exists in the first video file, wherein the abnormal segment is a part with abnormal sound in the first video file;
and if the abnormal segment exists, displaying a starting time point and an ending time point of the abnormal segment on the clipping interface, wherein the starting time point and the ending time point of the abnormal segment are used as the basis for the user to input the plurality of groups of clipping time points.
4. The method of claim 3, wherein determining whether an abnormal segment exists in the first video file comprises:
extracting an audio track in the first video file to obtain an audio corresponding to the first video file;
performing speech-to-text recognition on the audio to obtain a text file of the audio, and performing decibel recognition on the audio according to a time axis to obtain a correspondence between decibel level and the time axis;
and determining whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
5. The method of claim 4, wherein determining whether an abnormal segment exists according to the text file and the correspondence between decibel level and the time axis comprises:
searching the text for a first preset word and a second preset word and, in the case that both appear, taking the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment;
and searching the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, in the case that such a portion is found, taking the portion as the abnormal segment.
6. A video clip processing system, comprising:
the first acquisition module is used for acquiring the duration and the file size of a first video file to be edited;
an opening module, configured to open up two cache spaces in a memory according to the size of the first video file, wherein the two cache spaces comprise a first cache space and a second cache space, the first cache space and the second cache space are the same size, and the capacity of each of the first cache space and the second cache space is larger than the size of the first video file; when the first cache space and the second cache space are opened up, to judge whether the space left in the memory is larger than the sum of the first cache space and the second cache space and, if so, to search for a plurality of contiguous free spaces in the memory and open up the first cache space preferentially in the largest contiguous free space; and, after the first cache space is opened up, to open up the second cache space in the remaining space in the memory;
the putting module is used for putting the first video file into a first cache space;
a second obtaining module, configured to obtain a plurality of video segments to be reserved in the first video file, wherein the video segments to be reserved are to be spliced into a second video file, the second video file being the edited video file;
the copying module is used for acquiring the starting time and the ending time of each video segment to be reserved and copying each video segment to be reserved into the second cache space in sequence according to the starting time and the ending time; after the starting time and the ending time of each video segment to be reserved are obtained, a first memory index corresponding to the starting time and a second memory index corresponding to the ending time of each video segment to be reserved are searched for in the first cache space, wherein the first memory index and the second memory index indicate positions within the first cache space and the first cache space is a contiguous cache space; the video segment to be reserved is located in the first cache space according to the first memory index and the second memory index, and the located video segment to be reserved is copied into the second cache space;
and the splicing module is used for splicing the video segments copied into the second cache space to obtain the second video file.
7. The system of claim 6, wherein the second obtaining module is configured to:
displaying the first video file on a clipping interface according to a time axis;
receiving a plurality of groups of clipping time points input by a user, wherein each group of clipping time points comprises a clipping starting time and a clipping ending time;
and taking the video segment within each group of clipping time points as a video segment to be reserved, wherein the starting time of the video segment to be reserved is the clipping starting time and the ending time of the video segment to be reserved is the clipping ending time.
8. The system of claim 7, further comprising:
the judging module is used for judging whether an abnormal segment exists in the first video file, wherein the abnormal segment is a part with abnormal sound in the first video file; and if the abnormal segments exist, displaying the starting time points and the ending time points of the abnormal segments on the clipping interface, wherein the starting time points and the ending time points of the abnormal segments are used as bases for the user to input the plurality of groups of clipping time points.
9. The system of claim 8, wherein the judging module is configured to:
extracting an audio track in the first video file to obtain an audio corresponding to the first video file;
performing speech-to-text recognition on the audio to obtain a text file of the audio, and performing decibel recognition on the audio according to a time axis to obtain a correspondence between decibel level and the time axis;
and determining whether abnormal segments exist according to the text file or the correspondence between decibel level and the time axis.
10. The system of claim 9, wherein the judging module is configured to:
search the text for a first preset word and a second preset word and, in the case that both appear, take the video segment between the time point of the first preset word and the time point of the second preset word as the abnormal segment;
and search the correspondence for a portion whose sound is below a preset decibel value for longer than a preset duration and, in the case that such a portion is found, take the portion as the abnormal segment.
CN202210817931.5A 2022-07-13 2022-07-13 Video clip processing method and system Active CN114900713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210817931.5A CN114900713B (en) 2022-07-13 2022-07-13 Video clip processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210817931.5A CN114900713B (en) 2022-07-13 2022-07-13 Video clip processing method and system

Publications (2)

Publication Number Publication Date
CN114900713A CN114900713A (en) 2022-08-12
CN114900713B (en) 2022-09-30

Family

ID=82730126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210817931.5A Active CN114900713B (en) 2022-07-13 2022-07-13 Video clip processing method and system

Country Status (1)

Country Link
CN (1) CN114900713B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488717A (en) * 2013-09-11 2014-01-01 北京华胜天成科技股份有限公司 Lock-free data gathering method and lock-free data gathering device
CN106484858A (en) * 2016-10-09 2017-03-08 腾讯科技(北京)有限公司 Hot Contents method for pushing and device
WO2017049612A1 (en) * 2015-09-25 2017-03-30 Intel Corporation Smart tracking video recorder
CN107864391A (en) * 2017-09-19 2018-03-30 北京小鸟科技股份有限公司 Video flowing caches distribution method and device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101740084B (en) * 2009-11-25 2012-05-09 中兴通讯股份有限公司 Clipping method and mobile terminal of multi-media segment
CN105519095B (en) * 2014-12-14 2018-06-12 深圳市大疆创新科技有限公司 A kind of method for processing video frequency, device and playing device
CN107484008A (en) * 2017-09-07 2017-12-15 北京奇虎科技有限公司 A kind of video editing and sharing method, device, electronic equipment and medium
CN109547864B (en) * 2017-09-21 2021-05-07 腾讯科技(深圳)有限公司 Media data processing method and device
CN109218749A (en) * 2018-09-13 2019-01-15 湖北鑫恒福科技发展有限公司 The method of Digital video storage
CN109194887B (en) * 2018-10-26 2021-11-30 深圳亿幕信息科技有限公司 Cloud shear video recording and editing method and plug-in
CN112738564B (en) * 2020-12-28 2023-04-14 创盛视联数码科技(北京)有限公司 Data processing method and device, electronic equipment and storage medium
CN113225618A (en) * 2021-05-06 2021-08-06 阿里巴巴新加坡控股有限公司 Video editing method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488717A (en) * 2013-09-11 2014-01-01 北京华胜天成科技股份有限公司 Lock-free data gathering method and lock-free data gathering device
WO2017049612A1 (en) * 2015-09-25 2017-03-30 Intel Corporation Smart tracking video recorder
CN106484858A (en) * 2016-10-09 2017-03-08 腾讯科技(北京)有限公司 Hot Contents method for pushing and device
CN107864391A (en) * 2017-09-19 2018-03-30 北京小鸟科技股份有限公司 Video flowing caches distribution method and device

Also Published As

Publication number Publication date
CN114900713A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
US8831953B2 (en) Systems and methods for filtering objectionable content
CN104994404A (en) Method and device for obtaining keywords for video
CN109299352B (en) Method and device for updating website data in search engine and search engine
CN102945679A (en) Video data processing method and video data read-write equipment
JP4233564B2 (en) Data processing apparatus, data processing program and recording medium
US9720997B2 (en) Method and apparatus for prioritizing metadata
EP2560105A1 (en) Method and device for storing audio data
US20140244699A1 (en) Apparatus and Methods for Selective Location and Duplication of Relevant Data
CN114900713B (en) Video clip processing method and system
CN114281977A (en) Similar document searching method and device based on massive documents
US8156072B2 (en) Method for fast reconstruction of content information
CN106411975B (en) Data output method and device and computer readable storage medium
CN110955784B (en) Electronic document processing method and device
CN112380382B (en) Audio classification method, device and storage medium
US11810570B2 (en) Graphical user interface displaying linked schedule items
CN110489125B (en) File management method and computer storage medium
CN115391299A (en) Method, system, device and storage medium for identifying different files during file copying
CN1453787A (en) Recording media maintenence support
CN111581166A (en) File storage method and device
US10810303B1 (en) Apparatus and methods for selective location and duplication of relevant data
CN113987049A (en) Sensitive data discovery processing method and system
Hill Analysis of Audio Recordings Made Using" Voice Memos" Application for iOS
CN110807006A (en) File sorting method and device, computer storage medium and terminal
CN116723361A (en) Video creation method, device, computer equipment and storage medium
CN117648348A (en) Data processing method, device and medium based on PaaS platform data engine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant