WO2011110063A1

WO2011110063A1 - Method and system for generating video scene library, method and system for retrieving video scenes

Info

Publication number: WO2011110063A1
Application number: PCT/CN2011/071072
Authority: WO
Inventors: 李平辉
Original assignee: Li Pinghui
Priority date: 2010-03-09
Filing date: 2011-02-18
Publication date: 2011-09-15
Also published as: CN102024009A

Abstract

A method and system for generating a video scene library, and a method and system for retrieving video scenes, are disclosed. The method for generating the video scene library comprises the following steps: A, marking time anchors and annotating captions for the video scenes in video files in a data source; B, extracting the annotated captions and storing them in a caption library; C, according to the marked time anchors, performing redundant cutting on the corresponding video files and intercepting the video scene fragments that correspond to the captions, which are stored in a video scene fragment library; D, establishing the relationship between caption fragments in the caption library and video scene fragments in the video scene library. The solution provides the data support users need to find target video scene fragments conveniently and quickly.

Description

Video scene library generation method and system, method and system for searching video scene

The present invention relates to the field of video search technology, and in particular to a method for generating a video scene library and to a method and system for searching video scenes based on that library. In addition, the present invention also relates to a method and system for directly searching for a video scene.

Background Art

With the popularity of the Internet and the development of network technologies, video search technology on the Internet is now widely used. Users can easily get the video information they want by using a video search engine. Today's video search technology is generally based on keyword search: video file names or related tags in the video database are matched against the keyword, and the video files that meet the search criteria are returned to the user. For example, if the user enters the keyword "crazy", then video files whose names contain the word "crazy", such as "Crazy Stone" and "Crazy Racing", are the search results that meet the criteria. Even the more advanced frame search techniques return results in units of entire video files. Today's video search technology does not provide a convenient search function for video clips.

Suppose someone wants to know how a word or sentence is used in actual movie scenes; for example, a student wants to know in which movie scenes "how are you?" is used. Under existing network technology, the student must first guess, from experience or other auxiliary information, which films are likely to contain the sentence "how are you?", and then use a subtitle search engine and a video search engine to find the subtitle file and the video file of each film. After a keyword search of the subtitle file confirms that the phrase "how are you?" occurs in the film, the student locates the time period containing the sentence by dragging the playback position or by using specific playback software, and watches it. If the user wants to collect video clips containing the phrase "how are you?", video cutting software is also needed to cut and collect them from the video files. Only by repeating this process can the user collect a number of different video scenes containing the "how are you?" conversation.

The same applies to users who need a large number of video scenes as material, such as a cinematographer who wants to study how many different war scenes were shot, or a video production enthusiast who needs rain scenes as material. They must first judge, from experience or other auxiliary information, which video files are likely to contain this type of scene, then watch a large number of those files to find the target scenes, and finally use video cutting software to cut and collect them.

From the above, it can be seen that under existing network technology the user has to spend a large amount of time to obtain a small number of target video scene segments. Today's video search technology cannot quickly deliver a large number of target video scene segments through keyword search.

Summary of the Invention

The problem to be solved by the present invention is to provide a method for generating a video scene library, which provides data support for the user to quickly and easily find the target video scene segment.

The invention also provides a generation system of a video scene library corresponding to the above method.

In addition, the present invention also provides a method and system for searching a video scene segment based on a video scene library generated by the above method, so that the user can find the target video scene segment conveniently and quickly.

In addition, the present invention also provides a method and system for directly searching for video scene segments, so that the user can quickly and easily find the target video scene segment.

In addition, the present invention also provides a method and system for generating a video scene, so as to quickly generate a large number of video scene segments.

In order to solve the above technical problems, the present invention uses the following technical solutions:

A method for generating a video scene library, the method comprising the following steps:

A. Performing time anchor annotation and subtitle annotation on the video scenes in the video files in the data source;

B. Extracting the annotated subtitle segments into the subtitle library;

C. Performing redundant cutting on the corresponding video file according to the marked time anchor points, intercepting the video scene segment corresponding to each subtitle, and storing the video scene segments in the video scene segment library;

D. Establishing a correspondence between the subtitle segments in the subtitle library and the video scene segments in the video scene library.

As a preferred solution of the present invention, in step A, the subtitle annotation includes the original text of a dialogue/narration in the video scene, or a synonymous explanation or generalization of the dialogue/narration, or a label describing the type of the video scene.

As a preferred solution of the present invention, step B further includes extracting the time anchor points and related video file information into the subtitle library.

As a preferred solution of the present invention, the order of step B and step C is interchanged.

A system for generating a video scene library, the system comprising:

An annotation unit for performing time anchor annotation and subtitle annotation on the video scene in the video file of the data source;

a subtitle extraction unit, configured to extract the subtitle segments of the annotation into the subtitle library;

a cutting unit, configured to perform redundant cutting on the video file according to the marked time anchor point, intercept the video scene segment corresponding to the subtitle, and store the video scene segment in the video scene segment library;

The relationship establishing unit is configured to establish a correspondence between the subtitle segment in the subtitle library and the video scene segment in the video scene library.

As a preferred solution of the present invention, in the annotation unit, the subtitle annotation includes the original text of a dialogue/narration in the video scene, or a synonymous explanation or generalization of the dialogue/narration, or a label describing the video scene type; the subtitle extraction unit further extracts the time anchor points and related video file information into the subtitle library.

A method for searching a video scene segment based on a video scene library generated by the above method, the method comprising the following steps:

a. The user inputs a keyword to request a search for video scene segments;

b. Searching for matching subtitle segment information in the subtitle library;

c. Returning the video scene segments associated with the matching subtitle segments.

As a preferred solution of the present invention, between step a and step b the method further includes: determining whether the request is for a video scene of the dialogue or narration type, or for a video scene of the description type.

If it is the former, step b searches the subtitle library for subtitle segments of the dialogue or narration type; if it is the latter, step b searches the subtitle library for subtitle segments of the description type.

As a preferred solution of the present invention, between step b and step c the method further includes: determining whether the request is to intercept the video scene segment in real time according to a new cutting time redundancy.

If yes, the corresponding video file is cut according to the time anchor points of the matching subtitle segment and the new cutting time redundancy to obtain the corresponding video scene segment; if not, the video scene segment corresponding to the matching subtitle segment is obtained through the association between the video scene segment library and the subtitle library.

A system for searching a video scene segment based on a video scene library generated by the above method, the system comprising:

An input unit through which the user inputs information;

a search unit, configured to search for video scene segments in the storage unit when the input unit initiates a request;

a storage unit, configured to store the generated video scene library, that is, the stored video scene segment library and the subtitle library;

A display unit for displaying video clips that match the search criteria.

As a preferred solution of the present invention, the search system further includes a determining unit, configured to determine whether the search request is for a scene of the dialogue or narration type or for a scene of the description type, and to determine whether the request is to re-cut and intercept the video scene segment according to an input cutting time redundancy.

As a preferred solution of the present invention, the system further includes a cutting unit for re-cutting the video file according to the input cutting time redundancy.

A method for directly searching for a video scene segment, the method comprising the following steps:

A', performing time anchor annotation and subtitle annotation on the video scenes in the video files in the data source;

B', extracting the annotated time anchor points and subtitle segments, together with related video file information, into the subtitle library;

C', the user inputs a keyword to request a search for video scene segments;

D', obtaining the matching subtitle segments and their corresponding time anchor points by keyword search;

E', according to the time anchor points and the cutting redundancy, cutting and intercepting the corresponding video file to obtain the target video scene segment, and returning it to the user.

A system for directly searching for video scene segments, the system comprising:

a subtitle library extracting unit, configured to extract the annotated subtitle segments and time anchor points, together with related video file information, and save them in the subtitle library;

An input unit through which the user inputs information;

a search unit, configured to perform a keyword matching search in the subtitle library when the input unit initiates a request;

a cutting unit, configured to obtain the target video scene segment by cutting a video file in the data source according to the time anchor point and the cutting redundancy;

A display unit for displaying video clips that match the search criteria.

A method for generating a video scene, the method comprising the following steps:

A'', performing time anchor annotation and subtitle annotation on the video scenes in the video files in the data source;

B'', extracting the annotated subtitle segments and time anchor points into the subtitle library;

C'', performing redundant cutting on the corresponding video file according to the marked time anchor points, and intercepting the video scene segment corresponding to each subtitle segment.

A system for generating a video scene, the system comprising:

a caption extraction unit, configured to extract the caption segment and the time anchor point into the caption library;

The cutting unit is configured to perform redundant cutting on the video file according to the marked time anchor point, and intercept the video scene segment corresponding to the subtitle segment.

The beneficial effects of the invention are as follows. With the method and system for generating a video scene library and the method and system for searching video scene segments, the only manual effort required is creating the corresponding subtitle files; the video scenes in the video files are then collected into the video scene library automatically. The video scene library is analogous to a font library or a thesaurus: it contains video scene segments from various video files together with their corresponding subtitle segments. By entering keywords at a terminal, a user can easily obtain a large number of target video scene segments from different video files. This removes the burden that today's network technology imposes for the same purpose: downloading or copying many large video files, searching the subtitle files, locating the scenes in the video files, and cutting them out. The invention makes up for the inability of video search engines to search video clips, provides a keyword-based solution for searching video scenes, and saves a great deal of time for users, especially foreign language learners and video editors.

Drawings

FIG. 1 is a flowchart of a method for generating a video scene library according to the present invention.

FIG. 2 is a flowchart of a video scene search method according to the present invention.

FIG. 3 is a structural diagram of a video scene library of the present invention.

FIG. 4 is a schematic diagram of a video scene library generation system according to the present invention.

FIG. 5 is a schematic diagram of a video scene search system of the present invention.

FIG. 6 is a schematic diagram of a video scene search example of the present invention.

Detailed Description

Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

Embodiment 1

The video file according to the solution of the present invention does not affect the implementation of the present invention in terms of its content, format, type and the like. In the following example, a file of a general English movie video is taken as an example, but the implementation of the solution of the present invention is not limited to the video file of an English movie. For example, the present invention is also applicable to Chinese movies, other foreign language movies, and non-film videos.

Referring to FIG. 1, FIG. 1 discloses a preferred implementation example of a method for generating a video scene library according to the present invention. The method includes the following steps:

[Step 101]

According to the preset framing rule, time anchor annotation and subtitle annotation are performed on each video scene of the video file. A typical framing rule treats each complete dialogue or narration in the video as one scene unit, and each specific scene as one scene unit. The subtitle content can be the original text of a dialogue/narration, or a synonymous explanation or generalization of it, corresponding to a video scene of the dialogue or narration type; or it can be a scene description label, corresponding to a video scene of the description type. Take the video file of the film "Forrest Gump" as an example. Assume that the preset framing rule frames each complete dialogue or narration in the video, and also frames specific scenes such as snow scenes, seascapes, rain, and battle scenes. Suppose the movie contains a total of 2000 dialogues and narrations and 50 specific scenes. The film is then divided into 2050 video scenes, and this step requires time anchor annotation and subtitle annotation for these 2050 video scenes. Two time anchors are marked: a start time anchor and an end time anchor. The start time anchor is the point in time at which the video scene starts playing in the video file; the end time anchor is the point in time at which the video scene ends playing in the video file. The content of the subtitle annotation is either the dialogue or narration or the scene description label.

For example, between "00:32:46.634" and "00:32:48.727" there is the dialogue "My name is Forrest Gump". According to the framing rule, a video scene unit is generated centered on this time period. The annotated start time anchor is "00:32:46.634", the annotated end time anchor is "00:32:48.727", and the annotated subtitle content is the dialogue "My name is Forrest Gump". This video scene is a video scene of the dialogue or narration type.

As another example, the period between "00:49:10.123" and "00:51:06.351" is a relatively independent battle scene. According to the framing rule, a video scene unit is generated centered on this time period. The annotated start time anchor is "00:49:10.123", the annotated end time anchor is "00:51:06.351", and the annotated subtitle is the scene description label "War". This video scene is a video scene of the description type.

In this step, the video scenes are annotated using general video subtitle creation technology, and a subtitle file containing the time anchor points and subtitle segments is output. Subtitle production is a mature technology and is not described further here.

[Step 102]

Through regular expression matching, all annotated time anchors and subtitles, together with related video file information, are extracted and stored in the subtitle library. The subtitle library can use a general commercial database product. Each stored element includes a complete subtitle annotation and the corresponding start time anchor and end time anchor. The main structure of the subtitle library is shown in the following table:

| Serial number | Start time anchor | End time anchor | Subtitle information | Type | Source video |
| ... | ... | ... | ... | ... | ... |
| N | 00:07:18.269 | 00:07:20.438 | How are you? | Dialogue | "Scent of a Woman" |
| N+1 | 00:32:46.634 | 00:32:48.727 | My name is Forrest Gump. | Dialogue | "Forrest Gump" |
| N+2 | 00:49:10.123 | 00:51:06.351 | War | Description | "Forrest Gump" |
| ... | ... | ... | ... | ... | ... |

Table 1
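The extraction in Step 102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the subtitle file produced in Step 101 uses an SRT-like layout (an index line, a "start --> end" line, then the text), which the patent does not specify. The names `SubtitleRecord` and `parse_subtitles`, and the convention that a description label is written in square brackets such as "[War]", are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class SubtitleRecord:
    start: str       # start time anchor, HH:MM:SS.mmm
    end: str         # end time anchor, HH:MM:SS.mmm
    text: str        # subtitle annotation
    scene_type: str  # "Dialogue" or "Description"
    source: str      # title of the source video


# One SRT-like cue: index line, "start --> end" line, then one or more text lines.
CUE = re.compile(
    r"^\d+\s*\n"
    r"(\d{2}:\d{2}:\d{2})[,.](\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2})[,.](\d{3})\s*\n"
    r"(.+?)(?=\n\s*\n|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_subtitles(raw: str, source: str) -> list[SubtitleRecord]:
    """Extract time anchors and subtitle text from an SRT-like string."""
    records = []
    for m in CUE.finditer(raw):
        text = " ".join(m.group(5).split())
        # Assumed convention (not from the patent): "[War]" marks a description label.
        if text.startswith("[") and text.endswith("]"):
            scene_type, text = "Description", text[1:-1]
        else:
            scene_type = "Dialogue"
        records.append(
            SubtitleRecord(
                start=f"{m.group(1)}.{m.group(2)}",
                end=f"{m.group(3)}.{m.group(4)}",
                text=text,
                scene_type=scene_type,
                source=source,
            )
        )
    return records
```

Each returned record corresponds to one row of Table 1 and could be inserted into whatever database product backs the subtitle library.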

[Step 103]

Taking the marked time anchors as input parameters, the cutting function of a multimedia programming library is called in a loop, for example the related functions of the Cut class in the Java Media Framework (JMF), to cut and capture each video scene segment from the video file according to the cutting time redundancy. The cutting time redundancy is the amount of time by which the cut extends forward and backward around the time period in which the target video scene lies. Its purpose is to give the user the context of the target video scene. The cutting time redundancy is generally a function of the subtitle segment text length and the video scene duration, i.e. z = f(x, y), where z is the cutting time redundancy, x is the number of characters or words in the subtitle segment, and y is the duration of the marked video scene. The cutting time redundancy can also be a manually defined constant. The cut points of a video segment are:

Start cutting time point = marked start time anchor - cutting time redundancy

End cutting time point = marked end time anchor + cutting time redundancy

Take the subtitle unit "How are you?" from Table 1 as an example. Its start time anchor is "00:07:18.269" and its end time anchor is "00:07:20.438". Assume that the calculated or predefined cutting time redundancy is 3 seconds. Then the video from "00:07:15.269" to "00:07:23.438" is taken as the target video scene segment.
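The cut-point arithmetic of Step 103 can be sketched as below. This is a minimal sketch assuming a constant redundancy in seconds; the function names `parse_anchor`, `format_anchor`, and `cut_window` are hypothetical, and clamping the start at zero is an added safeguard that the patent does not mention.

```python
from datetime import timedelta


def parse_anchor(anchor: str) -> timedelta:
    """Convert an 'HH:MM:SS.mmm' time anchor into a timedelta."""
    hms, millis = anchor.split(".")
    h, m, s = (int(part) for part in hms.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=int(millis))


def format_anchor(t: timedelta) -> str:
    """Convert a timedelta back into 'HH:MM:SS.mmm'."""
    total_ms = t // timedelta(milliseconds=1)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def cut_window(start_anchor: str, end_anchor: str, redundancy_s: float) -> tuple[str, str]:
    """Extend the annotated period by the cutting time redundancy on both sides."""
    z = timedelta(seconds=redundancy_s)
    # Assumption: never start before the beginning of the file.
    start = max(parse_anchor(start_anchor) - z, timedelta(0))
    end = parse_anchor(end_anchor) + z
    return format_anchor(start), format_anchor(end)


print(cut_window("00:07:18.269", "00:07:20.438", 3))
# ('00:07:15.269', '00:07:23.438')
```

A redundancy computed as z = f(x, y) could be passed in the same way; only the value of `redundancy_s` would change.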

The set of video scene segments obtained from all of the cuts is stored in the video scene segment library. The entity behind the concept of the video scene segment library may be a general commercial database product or a set of files in a common operating system.

[Step 104]

Using the database association technology, the association relationship between the video scene segment library and the subtitle library is established, and a video scene library for searching is comprehensively formed. Referring to FIG. 3, each video scene segment has a corresponding subtitle segment.
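One way to picture the association in Step 104 is a foreign key from a segment table to a subtitle table. The sketch below uses SQLite purely for illustration; the patent only says "database association technology", so the schema, the table and column names, the helper functions, and the example file path are all assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE subtitle (
    id           INTEGER PRIMARY KEY,
    start_anchor TEXT NOT NULL,
    end_anchor   TEXT NOT NULL,
    text         TEXT NOT NULL,
    scene_type   TEXT NOT NULL,
    source_video TEXT NOT NULL
);
CREATE TABLE scene_segment (
    id          INTEGER PRIMARY KEY,
    subtitle_id INTEGER NOT NULL UNIQUE REFERENCES subtitle(id),
    file_path   TEXT NOT NULL
);
"""


def create_library(conn: sqlite3.Connection) -> None:
    """Create the subtitle library and the video scene segment library."""
    conn.executescript(SCHEMA)


def add_scene(conn, start, end, text, scene_type, source, file_path) -> int:
    """Store one subtitle segment and associate its cut video segment with it."""
    cur = conn.execute(
        "INSERT INTO subtitle (start_anchor, end_anchor, text, scene_type, source_video) "
        "VALUES (?, ?, ?, ?, ?)",
        (start, end, text, scene_type, source),
    )
    subtitle_id = cur.lastrowid
    conn.execute(
        "INSERT INTO scene_segment (subtitle_id, file_path) VALUES (?, ?)",
        (subtitle_id, file_path),
    )
    return subtitle_id


def segment_for(conn, subtitle_id: int):
    """Follow the association from a subtitle segment to its video scene segment."""
    row = conn.execute(
        "SELECT file_path FROM scene_segment WHERE subtitle_id = ?", (subtitle_id,)
    ).fetchone()
    return row[0] if row else None
```

The `UNIQUE` constraint on `subtitle_id` reflects the statement that each video scene segment has one corresponding subtitle segment.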

Steps 101 to 104 are repeated for each video file of the data source, so that all video scenes in the different video files, together with their related subtitles, are included in the video scene library.

In addition, since step 102 and step 103 do not depend on each other, their order can be interchanged.

The method for generating a video scene library of the present invention has been described above. Along with this method, the present invention also discloses a corresponding system for generating a video scene library. Referring to FIG. 4, the video scene library generation system includes: an annotation unit 401, a subtitle extraction unit 402, a cutting unit 403, and a relationship establishing unit 404.

The annotation unit 401 is configured to perform time anchor annotation and subtitle annotation on the video scenes in the video files of the data source; the subtitle extraction unit 402 is configured to extract the annotated subtitle segments and time anchor points, together with other video file information, into the subtitle library; the cutting unit 403 is configured to perform redundant cutting on the video file according to the marked time anchor points, intercept the video scene segment corresponding to each subtitle, and store the video scene segments in the video scene segment library; the relationship establishing unit 404 is configured to establish the correspondence between the subtitle segments in the subtitle library and the video scene segments in the video scene library. For the working principle of each unit, refer to the description of the above method; it is not repeated here.

The video scene library can be obtained by the above method and system for generating a video scene library. The following describes a method for searching video scene segments based on the video scene library. Referring to FIG. 2 and FIG. 6, the method for searching a video scene segment based on the video scene library of the present invention includes the following steps:

[Step 201]

The user inputs a keyword at the terminal, for example by entering "how are you" in the input box 601, to issue a request to search for a video scene.

[Step 202]

After the user clicks the search video scene button 603, option 602 is used to determine whether the request is for a video scene of the dialogue or narration type or for a video scene of the description type. If it is for a video scene of the dialogue or narration type, step 203 is performed; if it is for a video scene of the description type, step 204 is performed.

[Step 203]

Search the subtitle library for matching subtitle segments of the dialogue or narration type. For example, if the keyword entered is "how are you", then "how are you?" from "Scent Of A Woman", "how are you doing?" from "Bad Lieutenant", "Hi. How are you?" from "Mona Lisa Smile", and so on are matching target subtitle segments. The matching results are displayed in list item 604.

[Step 204]

Search the subtitle library for matching subtitle segments of the description type. For example, if the keyword entered is "war", the subtitle segments corresponding to the war video scene segments from "Forrest Gump", "Independence Day", "Avatar", and so on are matching target subtitle segments.
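Steps 203 and 204 differ only in the scene type they filter on, which can be sketched as a single function. This is a minimal in-memory illustration, assuming a case-insensitive substring match; the patent does not define the matching rule, and the names `Subtitle` and `search_subtitles` are hypothetical.

```python
from typing import NamedTuple


class Subtitle(NamedTuple):
    text: str
    scene_type: str  # "Dialogue" or "Description"
    source: str


def search_subtitles(library, keyword: str, scene_type: str):
    """Return subtitle segments of the requested type whose text contains the keyword."""
    needle = keyword.casefold()
    return [
        s for s in library
        if s.scene_type == scene_type and needle in s.text.casefold()
    ]


library = [
    Subtitle("How are you?", "Dialogue", "Scent Of A Woman"),
    Subtitle("How are you doing?", "Dialogue", "Bad Lieutenant"),
    Subtitle("Hi. How are you?", "Dialogue", "Mona Lisa Smile"),
    Subtitle("War", "Description", "Forrest Gump"),
]

dialogue_hits = search_subtitles(library, "how are you", "Dialogue")
description_hits = search_subtitles(library, "war", "Description")
```

A plain substring match would also accept "war" inside a longer word such as "toward"; a production system would likely use word-level or full-text matching instead.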

[Step 205]

According to whether default play 605 or cut play 606 is selected, it is determined whether the request is to return the relevant video scene segment directly from the video scene library, or to obtain the video scene segment by cutting the video file in real time according to a newly input cutting time redundancy. If the segment is to be returned directly from the video scene library, step 206 is performed; if real-time cutting is required, step 207 is performed.

[Step 206]

According to the association between the video scene segment library and the subtitle library in the video scene library, the target video scene segment corresponding to each matched subtitle segment is returned. Examples are the target video scene segments corresponding to the matched subtitle segments "how are you?" in "Scent Of A Woman", "how are you doing?" in "Bad Lieutenant", and "Hi. How are you?" in "Mona Lisa Smile".

[Step 207]

The time anchor points are obtained from the matched subtitle segment. For example, the time anchors of "how are you?" in "Scent Of A Woman" are "00:07:18.269" and "00:07:20.438". Suppose the user selects cut play 606 and enters a new time redundancy of 5 seconds in the system pop-up box 607. After cut play is clicked, the program cuts the video clip from "00:07:13.269" to "00:07:25.438" out of "Scent Of A Woman" and returns it to the searching user.
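The real-time re-cut in Step 207 can be sketched as building a cutting command from the time anchors and the newly entered redundancy. The patent refers to JMF for cutting; the sketch below substitutes the ffmpeg command-line tool as an assumption, using its `-i`, `-ss`, `-to`, and `-c copy` options. The function names and file paths are hypothetical, and the command is only constructed, not executed.

```python
def shift_anchor(anchor: str, delta_ms: int) -> str:
    """Shift an 'HH:MM:SS.mmm' anchor by delta_ms milliseconds, clamped at zero."""
    hms, millis = anchor.split(".")
    h, m, s = (int(part) for part in hms.split(":"))
    total = max(((h * 60 + m) * 60 + s) * 1000 + int(millis) + delta_ms, 0)
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def recut_command(video_path, start_anchor, end_anchor, redundancy_s, out_path):
    """Build an ffmpeg argument list that cuts the scene with the new redundancy."""
    delta_ms = round(redundancy_s * 1000)
    start = shift_anchor(start_anchor, -delta_ms)
    end = shift_anchor(end_anchor, delta_ms)
    # "-c copy" avoids re-encoding; cut points may then snap to nearby keyframes.
    return ["ffmpeg", "-i", video_path, "-ss", start, "-to", end, "-c", "copy", out_path]


cmd = recut_command(
    "scent_of_a_woman.mp4", "00:07:18.269", "00:07:20.438", 5, "how_are_you.mp4"
)
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.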

[Step 208]

The target video scene segments found by the search are displayed; see the video play interface 608.

The method for searching a video scene segment based on the video scene library of the present invention has been described above. The present invention also discloses a corresponding system for searching a video scene segment based on a video scene library. Referring to FIG. 5, the search system includes: an input unit 501, a determining unit 502, a search unit 503, a cutting unit 504, a storage unit 505, and a display unit 506.

The input unit 501 is configured to acquire information input by the user, including keywords and the cutting time redundancy. The determining unit 502 is configured to determine whether to search for a video scene of the dialogue or narration type or a video scene of the description type. If it is the former, the search unit is called to search for video scenes of the dialogue or narration type in the video scene library in the storage unit; if it is the latter, the search unit is called to search for video scenes of the description type in the video scene library in the storage unit. The determining unit 502 is further configured to determine whether the video scene segment must be cut in real time according to the input cutting time redundancy. If real-time cutting is not required, the relevant video scene segment is returned directly from the video scene library in the storage unit.

The search unit 503 is configured to search for subtitles and video scene segments in the storage unit when the determining unit initiates a request.

The cutting unit 504 is configured to re-cut the video file in the storage unit according to the input cutting time redundancy when the determining unit initiates a request.

The storage unit 505 is configured to store the generated video scene library and the data source video file of the generated video scene library.

The display unit 506 is for displaying a video scene segment that matches the search condition.

Through the method and system for generating a video scene library described in this example, a video scene library analogous to a font library or a thesaurus can be obtained. The library contains video scene segments from various video files together with their corresponding subtitle segments. With the method for searching video scenes based on the video scene library in this example, the user can easily obtain a large number of target video scene segments from different video files by entering keywords at the terminal. This removes the burden that current network technology imposes for the same purpose: downloading or copying many large video files, searching the subtitle files, and locating and cutting the scenes in the video files.

Embodiment 2

The difference between this embodiment and the first embodiment is that no video scene library is constructed here. When the user requests a search for video scene segments, the video file in the data source is cut in real time according to the time anchor points of the matched subtitle segment, and the target video scene segment is returned to the user. For the technical details of the steps and the working principle of each unit, refer to the first embodiment; no further details are given here.

A method for directly searching for a video scene segment, the method comprising the following steps:

E', according to the time anchor points and the cutting redundancy, cutting and intercepting the corresponding video file to obtain the target video scene segment, and returning it to the user.

Corresponding to the above method is a system for directly searching for a video scene segment, the system comprising: an annotation unit, a subtitle library extraction unit, an input unit, a search unit, a cutting unit, and a display unit.

The labeling unit is configured to perform time anchor annotation and subtitle annotation on the video scene in the video file of the data source;

The subtitle library extracting unit is configured to extract the labeled subtitle segment and the time anchor point and the related video file information into the subtitle library;

The user inputs information through the input unit;

The search unit is configured to perform a keyword matching search in the subtitle library when receiving the input unit to initiate the request;

The cutting unit is configured to intercept the target video scene segment from the video file in the data source according to the time anchor point and the cutting redundancy amount;

The display unit is used to display a video clip that matches the search criteria.
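The direct search of this embodiment, which goes from keyword to cut window without a pre-built segment library, can be sketched as follows. This is a minimal sketch: the names `Cue` and `direct_search` are hypothetical, times are held in milliseconds for simplicity, the matching rule is assumed to be a case-insensitive substring match, and the returned cut plans would still have to be handed to a real cutting tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start_ms: int  # start time anchor in milliseconds
    end_ms: int    # end time anchor in milliseconds
    text: str      # subtitle segment
    source: str    # source video file in the data source


def direct_search(cues, keyword: str, redundancy_ms: int):
    """Match subtitle segments by keyword and compute a cut window for each match."""
    needle = keyword.casefold()
    plans = []
    for cue in cues:
        if needle in cue.text.casefold():
            start = max(cue.start_ms - redundancy_ms, 0)  # clamp at the start of the file
            end = cue.end_ms + redundancy_ms
            plans.append((cue.source, start, end))
    return plans


cues = [
    Cue(438_269, 440_438, "How are you?", "Scent of a Woman"),
    Cue(1_966_634, 1_968_727, "My name is Forrest Gump.", "Forrest Gump"),
]
plans = direct_search(cues, "how are you", 3000)
```

Because nothing is cut in advance, storage is limited to the subtitle library and the source files, at the cost of cutting time on every request.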

The method and system for directly searching for video scene segments described in this example do not rely on a video scene library. By entering keywords at the terminal, the user can easily obtain a large number of target video scene segments, which are cut and intercepted in real time from different video files. This removes the burden that today's network technology imposes for the same purpose: downloading or copying many large video files, searching the subtitle files, and locating and cutting the scenes in the video files.

Embodiment 3

The difference between this embodiment and the above two embodiments is that this embodiment covers only a method and system for generating video scenes. For the technical details of the steps and the working principle of each unit, refer to the first embodiment; details are not repeated here.

A method for generating a video scene disclosed in this embodiment includes the following steps:

C'', performing redundant cutting on the corresponding video file according to the marked time anchor points, and intercepting the video scene segment corresponding to each subtitle segment.

A system for generating a video scene corresponding to the above method, the system comprising an annotation unit, a subtitle extraction unit, and a cutting unit.

The subtitle extraction unit is configured to extract the annotated subtitle segments and time anchor points and store them in the subtitle library;

The cutting unit is configured to perform redundant cutting on the video file according to the marked time anchor point, and intercept the video scene segment corresponding to the subtitle segment.
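The redundant cutting performed by the cutting unit can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method fixed by the patent: the function name `build_cut_command`, the output file name, the symmetric redundancy in seconds, and the choice of the `ffmpeg` command-line tool are all assumptions. The sketch only builds the command; it does not run it.

```python
def build_cut_command(video_file, start, end, redundancy, out_file):
    """Build an ffmpeg command that cuts [start - r, end + r] from video_file.

    start and end are the marked time anchors in seconds; redundancy (r) is
    the extra time kept on both sides of the anchored scene (an assumption).
    """
    cut_start = max(0.0, start - redundancy)   # never seek before the file start
    duration = (end + redundancy) - cut_start  # length of the padded segment
    return [
        "ffmpeg",
        "-ss", f"{cut_start:.3f}",  # seek to the padded start
        "-i", video_file,
        "-t", f"{duration:.3f}",    # keep this many seconds
        "-c", "copy",               # copy streams without re-encoding
        out_file,
    ]


# Hypothetical annotated scene: anchors at 12.0 s and 15.5 s, 2 s redundancy.
command = build_cut_command("movie.mp4", 12.0, 15.5, 2.0, "scene_0001.mp4")
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a machine where `ffmpeg` is installed. With stream copy, cut points may snap to nearby keyframes, which is one practical reason to cut a little more than the anchored interval.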

Through the method and system described in this embodiment, a large number of video scene segments can be obtained conveniently and quickly, providing a large amount of video scene material for the user's related work.

The description and application of the present invention are intended to be illustrative and are not intended to limit the scope of the invention. Variations and modifications of the embodiments disclosed herein are possible, and various alternative and equivalent components of the embodiments are well known to those of ordinary skill in the art. It is apparent to those skilled in the art that the present invention may be embodied in other forms, configurations, arrangements and ratios, and with other components and materials, without departing from the spirit or essential characteristics of the invention. Other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.

Claims

1. A method for generating a video scene library, the method comprising the following steps:

C. Perform redundant cutting on the corresponding video file according to the marked time anchor point, intercept the video scene segment corresponding to the subtitle, and store it in the video scene segment library;

D. Establish a correspondence between the subtitle segment in the subtitle library and the video scene segment in the video scene library.

2. The method for generating a video scene library according to claim 1, wherein:

In step A, the subtitle annotation includes a synonymous explanation or summary of the dialogue/narration text in the video scene, or the dialogue/narration itself, or a label describing the type of the video scene.

3. The method for generating a video scene library according to claim 1, wherein:

Step B further includes extracting the time anchor point and related video file information into the subtitle library.

4. The method for generating a video scene library according to claim 1, wherein:

The order of step B and step C is exchanged.

5. A system for generating a video scene library, the system comprising:

An annotation unit, configured to perform time anchor annotation and subtitle annotation on the video scene in the video file of the data source;

a cutting unit, configured to perform redundant cutting on the video file according to the marked time anchor point, and intercept the video scene segment corresponding to the subtitle, and store the video scene segment into the video scene segment library;

a relationship establishing unit, configured to establish a correspondence between the subtitle segment in the subtitle library and the video scene segment in the video scene library.

6. The system for generating a video scene library according to claim 5, characterized in that:

In the annotation unit, the subtitle annotation includes a synonymous explanation or summary of the dialogue/narration text in the video scene, or the dialogue/narration itself, or a label describing the type of the video scene;

The subtitle extraction unit further extracts the time anchor point and related video file information into the subtitle library.

7. A method for searching for a video scene segment based on a video scene library generated by the method of claim 1, wherein the method comprises the following steps:

a. The user inputs a keyword to request a search for a video scene segment;

c. Return the video scene segment associated with the matching subtitle segment.

8. The search method according to claim 7, wherein:

Between step a and step b, the method further includes:

Judging whether the request is a search for a video scene of the dialogue or narration type, or for a video scene of the description type;

If it is the former, step b searches the subtitle library for subtitle segments of the dialogue or narration type; if it is the latter, step b searches the subtitle library for subtitle segments of the description type.

9. The search method according to claim 7, wherein:

Between step b and step c, the method further includes:

Determining whether real-time cutting of the video scene segment according to a new cutting time redundancy is requested; if yes, cutting and intercepting the corresponding video file according to the time anchor point of the matching subtitle segment and the new cutting time redundancy to obtain the corresponding video scene segment; if not, obtaining the video scene segment corresponding to the matching subtitle segment according to the association between the video scene segment library and the subtitle library.

10. A system for searching a video scene segment based on a video scene library generated by the method of claim 1, wherein the system comprises:

An input unit through which the user inputs information;

a search unit, configured to search for a video scene segment in the storage unit when it receives a request initiated through the input unit;

a storage unit, configured to store the generated video scene library, that is, the video scene fragment library and the subtitle library are stored;

A display unit for displaying video clips that match the search criteria.

11. The search system according to claim 10, characterized in that:

The search system further includes a determining unit, configured to determine whether the search request is for a scene of the dialogue or narration type or for a scene of the description type, and to determine whether the video scene segment needs to be re-cut and intercepted according to the input cutting time redundancy.

12. The search system according to claim 10, characterized in that:

The system further includes a cutting unit for re-cutting the video file according to the input cutting time redundancy.

13. A method for directly searching for a video scene segment, the method comprising the steps of:

A', perform time anchor annotation and subtitle annotation on the video scenes in the video files in the data source;

B', extract the time anchor points, the subtitle segments and other video file information into the subtitle library;

C', the user inputs a keyword to request a search for a video scene segment;

E', cut and intercept the corresponding video file according to the time anchor point and the cutting redundancy to obtain the target video scene segment, and return it to the user.

14. A system for directly searching for a video scene segment, wherein the system includes:

a subtitle library extraction unit, configured to extract the annotated subtitle segments, the time anchor points and other video file information and store them in the subtitle library;

An input unit through which the user inputs information;

a search unit, configured to perform keyword matching retrieval in the subtitle library when receiving the input unit to initiate the request;

a cutting unit, configured to cut a target video scene segment by cutting a video file in the data source according to a time anchor point and a cutting redundancy amount;

A display unit for displaying video clips that match the search criteria.

15. A method for generating a video scene, the method comprising the following steps:

A'', perform time anchor annotation and subtitle annotation on the video scene in the video file in the data source;

B'', extract the annotated subtitle segments and time anchor points into the subtitle library;

16. A system for generating a video scene, the system comprising: