CN109783821B - Method and system for searching video of specific content

Method and system for searching video of specific content

Info

Publication number
CN109783821B
CN109783821B (application CN201910047102.1A)
Authority
CN
China
Prior art keywords
video
information
semantic
regular expression
voice
Prior art date
Legal status
Active
Application number
CN201910047102.1A
Other languages
Chinese (zh)
Other versions
CN109783821A (en)
Inventor
魏誉荧
Current Assignee
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201910047102.1A
Publication of CN109783821A
Application granted
Publication of CN109783821B
Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method and a system for searching videos of specific content. The method comprises the following steps: acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample; establishing a voice waveform library according to the video text information and the video waveforms; establishing a semantic slot and a regular expression library according to the video text information; obtaining video voice, and comparing the video voice with the voice waveform library to obtain video information corresponding to the video voice; matching the video information with the semantic slot and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information; and acquiring user search information, matching the user search information with the video marking information, and determining a user target video. By marking the video voice, the invention makes searching convenient for the user, so that videos of specific content can be found quickly and accurately.

Description

Method and system for searching video of specific content
Technical Field
The invention relates to the technical field of voice recognition, in particular to a method and a system for searching videos with specific contents.
Background
At present, in the era of information explosion, all kinds of information resources exist on the network, but their sheer number and uneven quality make it difficult for users to find the materials they need. The traditional video searching mode is to locate the video directly through voice; on the one hand the technical difficulty is high, and on the other hand directly matching the voice can make the search inaccurate. Therefore, a method and a system for searching for video of specific content are needed.
Disclosure of Invention
The invention aims to provide a method and a system for searching videos of specific contents, which are used for marking video voices so as to facilitate a user to search, thereby quickly and accurately searching the videos of the specific contents.
The technical scheme provided by the invention is as follows:
the invention provides a method for searching videos of specific contents, which comprises the following steps:
acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample;
establishing a voice waveform library according to the video text information and the video waveform;
establishing a semantic slot and a regular expression library according to the video text information;
obtaining video voice, and comparing the video voice with the voice waveform library to obtain video information corresponding to the video voice;
Matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information;
and acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
Further, the establishing the semantic slot and the regular expression library according to the video text information specifically includes:
the video text information is segmented through a segmentation technology to obtain corresponding segmented text and segmented part of speech corresponding to the segmented text;
analyzing sentence structure in the video text information to obtain connection relation between the word segmentation texts;
determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text, the word segmentation part of speech and the connection relation, wherein the word segmentation text contains the semantic word segmentation;
establishing a semantic slot according to the video text information and the semantic word, and establishing a corresponding relation between the video text information and the semantic word in the semantic slot;
generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation;
And establishing the regular expression library according to the video text information and the regular expression, and establishing a corresponding relation between the video text information and the regular expression in the regular expression library.
Further, the matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice with the video content semantics as video marking information specifically includes:
the video information is segmented through a word segmentation technology to obtain video word segmentations, and a video regular expression is generated according to the video information;
matching the video segmentation words with the semantic segmentation words of the semantic slots, and matching the video regular expressions with regular expressions in the regular expression library;
if both the video word segmentation and the video regular expression are matched, taking the corresponding semantic word segmentation in the semantic slot as the video content semantics;
if the video regular expression is matched but the video word segmentation is not matched, determining the video content semantics of the video information, and updating the semantic slot according to the video content semantics;
And marking the video voice by taking the video content semantic as video marking information.
Further, the obtaining user search information, matching the user search information with the video marking information, and determining the user target video specifically includes:
acquiring user search information;
and when the user searching information is a keyword, matching the keyword with the video marking information, and determining the user target video.
Further, the obtaining user search information, matching the user search information with the video marking information, and determining the user target video further includes:
when the user search information is a key sentence, matching the key sentence with the regular expression library, determining semantic segmentation corresponding to the semantic slot according to a matched regular expression, obtaining corresponding video mark information, and determining the user target video.
The invention also provides a searching system of the video of the specific content, which comprises the following steps:
the sample acquisition module acquires a video sample and generates corresponding video text information and video waveforms according to the video sample;
The waveform library establishing module is used for establishing a voice waveform library according to the video text information and the video waveform obtained by the sample obtaining module;
the database establishing module is used for establishing a semantic slot and a regular expression library according to the video text information obtained by the sample obtaining module;
the voice acquisition module acquires video voice, compares the video voice with the voice waveform library established by the waveform library establishment module, and obtains video information corresponding to the video voice;
the marking module is used for matching the video information obtained by the voice obtaining module with the semantic slots and the regular expression library established by the database establishing module to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information;
and the searching module is used for acquiring user searching information, matching the user searching information with the video marking information obtained by the marking module and determining a user target video.
Further, the database creation module specifically includes:
the word segmentation unit is used for segmenting the video text information obtained by the sample obtaining module through a word segmentation technology to obtain a corresponding word segmentation text and a word segmentation part of speech corresponding to the word segmentation text;
The relation analysis unit is used for analyzing sentence structures in the video text information obtained by the sample acquisition module to obtain connection relations among the word segmentation texts;
the processing unit is used for determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text obtained by the word segmentation unit, the word segmentation part of speech and the connection relation obtained by the relation analysis unit, wherein the word segmentation text comprises the semantic word segmentation;
the semantic slot establishing unit establishes the semantic slot according to the video text information obtained by the sample obtaining module and the semantic word obtained by the processing unit, and establishes the corresponding relation between the video text information and the semantic word in the semantic slot;
the generation unit is used for generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation obtained by the relation analysis unit;
the database establishing unit establishes the regular expression library according to the video text information obtained by the sample obtaining module and the regular expression generated by the generating unit, and establishes the corresponding relation between the video text information and the regular expression in the regular expression library.
Further, the marking module specifically includes:
the control unit is used for segmenting the video information through a word segmentation technology to obtain video word segmentations, and generating a video regular expression according to the video information;
the matching unit is used for matching the video segmentation obtained by the control unit with the semantic segmentation of the semantic slot and matching the video regular expression obtained by the control unit with the regular expression in the regular expression library;
the semantic determining unit is used for taking the corresponding semantic word segmentation in the semantic slot as the video content semantics if both the video word segmentation and the video regular expression obtained by the matching unit are matched;
the semantic determining unit is further used for determining the video content semantics of the video information and updating the semantic slot according to the video content semantics if the video regular expression obtained by the matching unit is matched but the video word segmentation is not matched;
and the marking unit marks the video voice by taking the video content semantics determined by the semantics determining unit as video marking information.
Further, the searching module specifically includes:
An acquisition unit that acquires user search information;
and the searching unit is used for matching the keyword with the video mark information when the user search information acquired by the acquiring unit is the keyword, and determining the user target video.
Further, the search module further includes:
and the searching unit is used for matching the key sentences with the regular expression library when the user searching information is the key sentences, determining the semantic segmentation corresponding to the semantic slots according to the matched regular expressions to obtain corresponding video mark information and determining the user target video.
The method and the system for searching the video with the specific content provided by the invention have at least one of the following beneficial effects:
1. according to the method, a large number of video samples are obtained to summarize the semantic slots and the regular expression library, then the video content semantics of the obtained video voice are analyzed according to the semantic slots and the regular expression library, and content marking is carried out, so that a user can quickly and accurately find a user target video.
2. According to the method, the semantic slot and the regular expression library are built by acquiring the video sample, analyzing the video sample, segmenting the video text information corresponding to the video sample by the segmentation technology, and analyzing the sentence structure of the video text information.
Drawings
The foregoing features, technical features, advantages and implementation of a video searching method and system for specific content will be further described in the following description of preferred embodiments with reference to the accompanying drawings in a clearly understandable manner.
FIG. 1 is a flow chart of one embodiment of a method of searching for video of a particular content of the present invention;
FIG. 2 is a flow chart of another embodiment of a method of searching for video of a particular content of the present invention;
FIG. 3 is a flow chart of another embodiment of a method of searching for video of a particular content of the present invention;
FIG. 4 is a flow chart of another embodiment of a method of searching for video of a particular content of the present invention;
FIG. 5 is a flow chart of another embodiment of a method of searching for video of a particular content of the present invention;
FIG. 6 is a schematic diagram of one embodiment of a video search system for specific content in accordance with the present invention;
fig. 7 is a schematic diagram of a search system for video of a specific content according to another embodiment of the present invention.
Reference numerals illustrate:
100 searching system for video of specific content
110 sample acquisition module
120 waveform library establishing module
130 database building module 131 word segmentation unit 132 relation analysis unit 133 processing unit
134 semantic slot building unit 135 generating unit 136 database building unit
140 voice acquisition module
150 marking module 151 control unit 152 matching unit 153 semantic determining unit 154 marking unit
160 search module 161 acquisition unit 162 search unit
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will explain specific embodiments of the present invention with reference to the drawings in the specification. It is evident that the drawings in the following description are only examples of the invention, from which other drawings and other embodiments can be obtained, without inventive effort for a person skilled in the art.
For the sake of simplicity of the drawing, the parts relevant to the present invention are shown only schematically in the figures, which do not represent their actual structure as a product. Additionally, in order to simplify the drawing for ease of understanding, components having the same structure or function in some of the drawings are shown schematically with only one of them, or only one of them is labeled. Herein, "a" means not only "only this one" but also "more than one" case.
In one embodiment of the present invention, as shown in fig. 1, a method for searching a video of a specific content includes:
S100, acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample.
Specifically, a video sample is obtained; the audio of the video sample is converted into corresponding video text information, and the video waveform corresponding to the video sample is obtained.
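As a concrete illustration of S100, the sketch below reads the sample sequence of a 16-bit mono PCM WAV file (standing in for the audio track of a video sample) with Python's standard `wave` module; the transcription step is stubbed out, since in practice it would be performed by a speech-to-text engine. All names and parameters here are illustrative assumptions, not part of the patent.

```python
import io
import math
import struct
import wave

def read_waveform(wav_bytes: bytes) -> list:
    """Extract the raw sample sequence (the 'video waveform') from WAV audio."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        frames = wf.readframes(wf.getnframes())
    # 16-bit mono PCM assumed for this sketch
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))

def make_test_tone(freq=440, rate=8000, seconds=0.01) -> bytes:
    """Synthesize a tiny mono 16-bit WAV so the sketch is self-contained."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        n = int(rate * seconds)
        samples = [int(3000 * math.sin(2 * math.pi * freq * t / rate))
                   for t in range(n)]
        wf.writeframes(struct.pack("<%dh" % n, *samples))
    return buf.getvalue()

tone = make_test_tone()
waveform = read_waveform(tone)
# the transcript would come from an ASR engine in practice; stubbed here
video_text = "transcript produced by a speech-to-text engine"
```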
S200, establishing a voice waveform library according to the video text information and the video waveform.
Specifically, a voice waveform library is established according to the video text information and the video waveform, and a corresponding relation between the video text information and the video waveform is established in the voice waveform library. When the video text information contains a large amount of content, it can be appropriately split into segment text information; correspondingly, the video waveform is split into video segment waveforms corresponding to the segment text information, and the corresponding relation between segment text information and video segment waveforms is likewise established in the voice waveform library.
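The splitting described above can be sketched as follows; the four-word threshold and the proportional halving rule are assumptions of this sketch, since the description does not fix a concrete split criterion.

```python
from dataclasses import dataclass, field

@dataclass
class WaveformLibrary:
    """Speech waveform library: maps each (segment) text to its waveform."""
    entries: dict = field(default_factory=dict)

    def add(self, text, waveform, max_len=4):
        words = text.split()
        if len(words) > max_len:
            # split long text into segment texts, and split the waveform
            # proportionally into matching segment waveforms
            half = len(words) // 2
            cut = len(waveform) * half // len(words)
            self.add(" ".join(words[:half]), waveform[:cut], max_len)
            self.add(" ".join(words[half:]), waveform[cut:], max_len)
        else:
            self.entries[text] = waveform

lib = WaveformLibrary()
lib.add("a b c d e f", [1, 2, 3, 4, 5, 6])
```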
S300, a semantic slot and a regular expression library are established according to the video text information.
Specifically, the sentence structure of the video text information and the usage patterns of the words it contains are analyzed, so that a semantic slot and a regular expression library are established.
S400, video voice is obtained, and the video voice and the voice waveform library are compared to obtain video information corresponding to the video voice.
Specifically, video voice is obtained, the waveform of the video voice is compared with the video waveforms in the voice waveform library, and when a match is found, the video text information corresponding to that video waveform is the video information corresponding to the video voice.
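One hypothetical way to realize the comparison of S400 is a normalized dot-product (cosine) similarity with an acceptance threshold; the library entries and the 0.95 threshold below are invented for illustration.

```python
def similarity(a, b):
    """Normalized dot-product similarity between two equal-length waveforms."""
    if len(a) != len(b) or not a:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def lookup(library, query, threshold=0.95):
    """Return the video text information whose stored waveform best matches
    the query video voice, or None when nothing clears the threshold."""
    best_text, best_score = None, threshold
    for text, waveform in library.items():
        score = similarity(waveform, query)
        if score > best_score:
            best_text, best_score = text, score
    return best_text

# toy entries; real ones would come from the voice waveform library
waveform_lib = {"trigonometric function": [1, 2, 3, 2, 1],
                "equation": [5, 0, 5, 0, 5]}
```

In practice the comparison would use frame-level features and something like dynamic time warping or an acoustic model rather than raw samples; this sketch only shows the lookup structure.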
S500, matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information.
Specifically, matching the video information with a semantic slot and a regular expression library, so as to determine the video content semantics corresponding to the video voice, and marking the video voice by taking the video content semantics as video marking information.
S600, acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
Specifically, user search information input by a user is obtained, the user search information is matched with video mark information of each video voice one by one, and the matched video voices serve as user target videos searched by the user.
In the embodiment, a large number of video samples are obtained to summarize a semantic slot and a regular expression library, then the video content semantics of the obtained video voice are analyzed according to the semantic slot and the regular expression library, and content marking is performed, so that a user can quickly and accurately find a user target video.
Another embodiment of the present invention, which is an optimized embodiment of the above embodiment, as shown in fig. 2, includes:
S100, acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample.
S200, establishing a voice waveform library according to the video text information and the video waveform.
S300, a semantic slot and a regular expression library are established according to the video text information.
The step S300 of establishing a semantic slot and a regular expression library according to the video text information specifically comprises the following steps:
S310, word segmentation is carried out on the video text information through a word segmentation technology to obtain corresponding word segmentation text and word segmentation part of speech corresponding to the word segmentation text.
S320, analyzing sentence structures in the video text information to obtain connection relations among the word segmentation texts.
Specifically, the video text information is segmented by a word segmentation technology, splitting it into word segmentation texts such as characters, words and phrases, and the word segmentation part of speech corresponding to each word segmentation text is analyzed. The sentence structure of the video text information is then analyzed to obtain the connection relations between the word segmentation texts, such as the modifier-head (centering) relation.
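A toy illustration of S310-S320: the lexicon, its tags, and the "an adjective modifies the following noun" rule are all illustrative assumptions of this sketch; a production system would use a real segmenter and dependency parser (for Chinese text, e.g. jieba or LTP).

```python
# toy lexicon standing in for a real segmenter plus POS tagger;
# the word list and tags are invented for illustration
LEXICON = {
    "solve": "verb",
    "the": "det",
    "trigonometric": "adj",
    "function": "noun",
    "problem": "noun",
}

def segment(sentence):
    """Split text into (word segmentation text, part of speech) pairs."""
    return [(w, LEXICON.get(w, "unk")) for w in sentence.split()]

def relations(tagged):
    """Very rough structure analysis: an adjective modifies the next noun."""
    rels = []
    for i, (word, pos) in enumerate(tagged[:-1]):
        if pos == "adj" and tagged[i + 1][1] == "noun":
            rels.append((word, "modifies", tagged[i + 1][0]))
    return rels

tagged = segment("solve the trigonometric function problem")
```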
S330, determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text, the word segmentation part of speech and the connection relation, wherein the word segmentation text contains the semantic word segmentation.
Specifically, determining the semantics of the video text information according to the word segmentation text, the word segmentation part of speech and the connection relation, and taking one or more word segmentation texts as the semantic word of the video sample corresponding to the video text information, namely, the semantic word is the semantics of the extracted video sample.
S340, establishing a semantic slot according to the video text information and the semantic word, and establishing a corresponding relation between the video text information and the semantic word in the semantic slot.
Specifically, a semantic slot is established according to the video text information and the semantic word, and a corresponding relation between the video text information and the semantic word is established in the semantic slot, so that the subsequent searching according to the corresponding relation is facilitated.
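The semantic slot's correspondence between video text information and semantic words can be sketched as a plain mapping; this is one hypothetical realization, with invented example entries.

```python
# hypothetical semantic-slot store: video text -> its extracted semantic words
semantic_slot = {}

def add_to_slot(video_text, semantic_words):
    """Record the correspondence between video text information and its
    semantic word segmentations, for later lookup during search."""
    semantic_slot[video_text] = list(semantic_words)

add_to_slot("solve the trigonometric function problem",
            ["trigonometric", "function"])
```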
S350, generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation.
Specifically, a corresponding regular expression is generated according to the word segmentation text, the word segmentation part of speech and the connection relation. For example, the word segmentation part of speech determines whether a word segmentation text is replaced by a part-of-speech placeholder in the regular expression: word segmentation texts identified as semantic words are kept verbatim in the regular expression, while the connection relations between the word segmentation texts are preserved.
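One plausible reading of this rule: keep the semantic words verbatim and replace every other token with a pattern chosen by its part of speech. The sketch below is an assumption about how such a regular expression could be generated; the `POS_PATTERN` table and example sentence are invented for illustration.

```python
import re

# hypothetical mapping from part of speech to a regex placeholder
POS_PATTERN = {"det": r"\w+", "verb": r"\w+", "adj": r"\w+"}

def build_regex(tagged, semantic_words):
    """Keep semantic word segmentations literal; replace other tokens with
    part-of-speech placeholders, preserving token order (the connection
    relation is kept implicitly by adjacency in this sketch)."""
    parts = []
    for word, pos in tagged:
        if word in semantic_words:
            parts.append(re.escape(word))
        else:
            parts.append(POS_PATTERN.get(pos, r"\w+"))
    return r"\s+".join(parts)

tagged = [("solve", "verb"), ("the", "det"),
          ("trigonometric", "adj"), ("function", "noun")]
pattern = build_regex(tagged, {"trigonometric", "function"})
```

A sentence with the same structure but different non-semantic words, such as "compute a trigonometric function", then matches the generated pattern.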
S360, the regular expression library is established according to the video text information and the regular expression, and the corresponding relation between the video text information and the regular expression is established in the regular expression library.
Specifically, a regular expression library is established according to the video text information and the regular expression, and the corresponding relation between the video text information and the regular expression is established in the regular expression library, so that the subsequent searching according to the corresponding relation is facilitated.
S400, video voice is obtained, and the video voice and the voice waveform library are compared to obtain video information corresponding to the video voice.
S500, matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information.
S600, acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
In this embodiment, a video sample is obtained and analyzed, a word segmentation technique is used to segment video text information corresponding to the video sample, and the sentence structure of the video text information is analyzed, so as to establish a semantic slot and a regular expression library.
Another embodiment of the present invention, which is an optimized embodiment of the above embodiment, as shown in fig. 3, includes:
S100, acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample.
S200, establishing a voice waveform library according to the video text information and the video waveform.
S300, a semantic slot and a regular expression library are established according to the video text information.
The step S300 of establishing a semantic slot and a regular expression library according to the video text information specifically comprises the following steps:
S310, word segmentation is carried out on the video text information through a word segmentation technology to obtain corresponding word segmentation text and word segmentation part of speech corresponding to the word segmentation text.
S320, analyzing sentence structures in the video text information to obtain connection relations among the word segmentation texts.
S330, determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text, the word segmentation part of speech and the connection relation, wherein the word segmentation text contains the semantic word segmentation.
S340, establishing a semantic slot according to the video text information and the semantic word, and establishing a corresponding relation between the video text information and the semantic word in the semantic slot.
S350, generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation.
S360, the regular expression library is established according to the video text information and the regular expression, and the corresponding relation between the video text information and the regular expression is established in the regular expression library.
S400, video voice is obtained, and the video voice and the voice waveform library are compared to obtain video information corresponding to the video voice.
S500, matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information.
The step S500 of matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and the step of marking the video voice by using the video content semantics as video marking information specifically comprises the following steps:
S510, word segmentation is carried out on the video information through a word segmentation technology to obtain video word segmentations, and a video regular expression is generated according to the video information.
S520, matching the video segmentation words with the semantic segmentation words of the semantic slots, and matching the video regular expressions with regular expressions in the regular expression library.
Specifically, the video information is segmented by a word segmentation technology, splitting it into video word segmentations such as words and phrases, and the video word segmentations, their corresponding parts of speech and the connection relations among them are analyzed, so as to generate the video regular expression corresponding to the video information. The video word segmentations are matched one by one with the semantic word segmentations of the semantic slot, and the video regular expression is matched one by one with the regular expressions in the regular expression library.
S530, if both the video word segmentation and the video regular expression are matched, using the corresponding semantic word segmentation in the semantic slot as the video content semantics.
Specifically, if the video word segmentation matches a certain semantic word segmentation of the semantic slot, and the video regular expression also matches a certain regular expression in the regular expression library, the semantic word segmentation in the matched semantic slot is taken as the video content semantics corresponding to the acquired video voice.
S540, if the video regular expression is matched but the video word segmentation is not matched, determining the video content semantics of the video information, and updating the semantic slot according to the video content semantics.
Specifically, if the video regular expression matches a regular expression in the regular expression library, but none of the semantic word segmentations in the semantic slot match the video word segmentations, the video information is analyzed to determine the video content semantics, and the semantic slot is updated according to the determined video content semantics.
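Steps S520-S540 can be condensed into one hypothetical decision routine. Token-set matching, plain `re.search`, and taking all tokens as the new semantics in the S540 branch are simplifying assumptions of this sketch, not the patent's prescribed method.

```python
import re

def tag_video(video_text, semantic_slot, regex_library):
    """Decide the video content semantics following S520-S540 (a sketch)."""
    tokens = set(video_text.split())           # stand-in for word segmentation
    slot_hit = tokens & set(semantic_slot)
    regex_hit = any(re.search(p, video_text) for p in regex_library)
    if slot_hit and regex_hit:
        return sorted(slot_hit)                # S530: reuse slot semantics
    if regex_hit:
        # S540: regex matched but no slot entry, so derive new semantics
        # (simplified here to all tokens) and update the semantic slot
        new_semantics = sorted(tokens)
        semantic_slot.extend(w for w in new_semantics
                             if w not in semantic_slot)
        return new_semantics
    return []

slot = ["function"]
library = [r"solve \w+"]
```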
S550, marking the video voice by taking the video content semantics as video marking information.
Specifically, the determined video content semantics are used as video marking information to mark video voice, so that the subsequent user can find the video voice conveniently.
S600, acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
In this embodiment, the obtained video voice is matched with the established semantic slots and the regular expression library, so as to determine the video content semantics corresponding to the video voice, and the video voice is marked as video marking information.
Another embodiment of the present invention, which is an optimized embodiment of the above embodiment, as shown in fig. 4, includes:
S100, acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample.
S200, establishing a voice waveform library according to the video text information and the video waveform.
S300, a semantic slot and a regular expression library are established according to the video text information.
S400, video voice is obtained, and the video voice and the voice waveform library are compared to obtain video information corresponding to the video voice.
S500, matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information.
S600, acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
The step S600 of obtaining user search information, matching the user search information with the video mark information, and determining the user target video specifically includes:
S610, acquiring user search information.
Specifically, the user search information is obtained. Since voice is currently one of the main modes of human-machine interaction, when the user search information is input by voice, it first needs to be converted into corresponding text information.
And S620, when the user search information is a keyword, matching the keyword with the video mark information to determine the user target video.
Specifically, if the user search information input by the user is a keyword, the acquired keyword is matched against the video marking information of each video voice to determine the user target video. For example, if the user directly inputs the user search information "trigonometric function", "trigonometric function" is matched against the video marking information of each video voice to obtain the video voice whose marking information is "trigonometric function".
In this embodiment, when the obtained user search information is a keyword, the keyword is matched directly against the video marking information of each video voice, so that the user target video is determined quickly and accurately.
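The keyword branch (S620) amounts to a direct lookup over the marking information. A minimal sketch, assuming the marking information has been collected into a dictionary; the `video_tags` data and the helper name are illustrative only.

```python
def search_by_keyword(keyword, video_tags):
    """Return the ids of video voices whose marking information matches the keyword."""
    return [vid for vid, tag in video_tags.items() if tag == keyword]

# Illustrative marking information produced by S500.
video_tags = {
    "lesson_01": "trigonometric function",
    "lesson_02": "quadratic equation",
}
result = search_by_keyword("trigonometric function", video_tags)  # ["lesson_01"]
```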
Another embodiment of the present invention, which is an optimized embodiment of the above embodiment, as shown in fig. 5, includes:
S100, acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample.
S200, establishing a voice waveform library according to the video text information and the video waveform.
S300, a semantic slot and a regular expression library are established according to the video text information.
S400, video voice is obtained, and the video voice and the voice waveform library are compared to obtain video information corresponding to the video voice.
S500, matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information.
S600, acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
The step S600 of obtaining user search information, matching the user search information with the video mark information, and determining the user target video specifically includes:
S610, acquiring user search information.
And S620, when the user search information is a keyword, matching the keyword with the video mark information to determine the user target video.
And S630, when the user search information is a key sentence, matching the key sentence with the regular expression library, determining semantic segmentation corresponding to the semantic slot according to a matched regular expression, obtaining corresponding video mark information, and determining the user target video.
Specifically, when the user search information is a key sentence, the key sentence is matched against the regular expression library. According to the correspondence between video text information and regular expressions in the library, the video text information corresponding to the matched regular expression is determined; then, according to the correspondence between video text information and semantic words in the semantic slot, the corresponding semantic word is determined, the corresponding video marking information is obtained, and the user target video is determined.
For example, if the user inputs the user search information "find video voice about trigonometric function", the key sentence is matched against the regular expression library; the match yields the corresponding semantic word "trigonometric function", which is then matched against the video marking information of each video voice to obtain the video voice whose marking information is "trigonometric function".
In this embodiment, when the obtained user search information is a key sentence, the key sentence is matched against the regular expressions in the regular expression library; if a match is found, the corresponding video marking information is determined, so that the user target video is found quickly and accurately.
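The key-sentence branch (S630) chains three correspondences: pattern to video text, video text to semantic word, semantic word to marking information. The sketch below is illustrative only; the pattern, the dictionaries, and the helper name are assumptions, not details from the source.

```python
import re

# Illustrative structures: each library pattern is linked to its source
# video text, and the semantic slot links that text to its semantic word.
regex_library = {
    r"find (?:the )?video(?: voice)? about (?P<topic>.+)":
        "find video voice about trigonometric function",
}
semantic_slot = {
    "find video voice about trigonometric function": "trigonometric function",
}
video_tags = {"lesson_01": "trigonometric function"}

def search_by_sentence(sentence):
    """S630: follow pattern -> video text -> semantic word, then match the tags."""
    for pattern, text in regex_library.items():
        if re.fullmatch(pattern, sentence):
            topic = semantic_slot[text]
            return [vid for vid, tag in video_tags.items() if tag == topic]
    return []
```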
As shown in fig. 6, according to an embodiment of the present invention, a search system 100 for a video of a specific content includes:
the sample acquisition module 110 acquires a video sample, and generates corresponding video text information and video waveforms according to the video sample.
Specifically, the sample acquisition module 110 acquires a video sample; its audio is converted into corresponding video text information, and the video waveform corresponding to the video sample is acquired.
The waveform library establishing module 120 establishes a voice waveform library according to the video text information and the video waveform obtained by the sample obtaining module 110.
Specifically, the waveform library establishment module 120 establishes a voice waveform library from the video text information and the video waveform, and records the correspondence between them in the library. When the video text information contains a large amount of content, it can be appropriately split into segment text information; correspondingly, the video waveform is split into video segment waveforms corresponding to the segment text information, and the correspondence between segment text information and video segment waveforms is likewise recorded in the voice waveform library.
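The library-building step, including the splitting of long samples, can be sketched as a dictionary of correspondences. The halving rule, the word-count threshold, and the list-of-samples waveform representation are assumptions for illustration, not the patented method.

```python
def build_waveform_library(samples, max_words=4):
    """samples: list of (video_text, waveform) pairs; waveform is a sample list.
    Returns a dict recording the text -> waveform correspondence, splitting
    long entries into aligned text/waveform halves."""
    library = {}
    for text, waveform in samples:
        words = text.split()
        if len(words) <= max_words:
            library[text] = waveform
        else:
            mid_w, mid_s = len(words) // 2, len(waveform) // 2
            library[" ".join(words[:mid_w])] = waveform[:mid_s]   # segment text ->
            library[" ".join(words[mid_w:])] = waveform[mid_s:]   # segment waveform
    return library
```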
And a database establishing module 130, configured to establish a semantic slot and a regular expression library according to the video text information obtained by the sample obtaining module 110.
Specifically, the database creation module 130 analyzes the sentence structure of the video text information and the usage patterns of the words it contains, and thereby creates the semantic slot and the regular expression library.
The voice obtaining module 140 obtains video voice, and compares the video voice with the voice waveform library established by the waveform library establishing module 120 to obtain video information corresponding to the video voice.
Specifically, the voice acquisition module 140 acquires a video voice and compares its waveform with the video waveforms in the voice waveform library; when a waveform matches, the video text information corresponding to that video waveform is taken as the video information corresponding to the video voice.
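The comparison itself is not specified in the source; as one hedged possibility, a normalised Euclidean distance between equal-length waveforms could stand in for it, with the threshold value chosen arbitrarily for this sketch.

```python
import math

def match_waveform(query, library, threshold=0.1):
    """Return the video text whose library waveform is closest to the query,
    or None if no waveform is close enough. A stand-in comparison only."""
    best_text, best_dist = None, float("inf")
    for text, waveform in library.items():
        if len(waveform) != len(query):      # compare equal-length waveforms only
            continue
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, waveform)))
        dist /= len(query)                   # normalise by waveform length
        if dist < best_dist:
            best_text, best_dist = text, dist
    return best_text if best_dist <= threshold else None
```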
The marking module 150 is configured to match the video information obtained by the voice obtaining module 140 with the semantic slots and the regular expression library established by the database establishing module 130, obtain video content semantics, and mark the video content semantics as video marking information for the video voice.
Specifically, the marking module 150 matches the video information with the semantic slots and the regular expression library, so as to determine the video content semantics corresponding to the video voice, and marks the video voice by using the video content semantics as the video marking information.
The searching module 160 obtains user search information, matches the user search information with the video marking information obtained by the marking module 150, and determines a user target video.
Specifically, the search module 160 obtains user search information input by the user, and matches the user search information with the video tag information of each video voice one by one, where the matched video voice is the user target video searched by the user.
In the embodiment, a large number of video samples are obtained to summarize a semantic slot and a regular expression library, then the video content semantics of the obtained video voice are analyzed according to the semantic slot and the regular expression library, and content marking is performed, so that a user can quickly and accurately find a user target video.
Another embodiment of the present invention, which is an optimized embodiment of the above embodiment, as shown in fig. 7, includes:
the sample acquisition module 110 acquires a video sample, and generates corresponding video text information and video waveforms according to the video sample.
The waveform library establishing module 120 establishes a voice waveform library according to the video text information and the video waveform obtained by the sample obtaining module 110.
And a database establishing module 130, configured to establish a semantic slot and a regular expression library according to the video text information obtained by the sample obtaining module 110.
The database creation module 130 specifically includes:
the word segmentation unit 131 performs word segmentation on the video text information obtained by the sample obtaining module 110 through a word segmentation technology to obtain a corresponding word segmentation text, and a word segmentation part of speech corresponding to the word segmentation text.
And a relation analysis unit 132 for analyzing the sentence structure in the video text information obtained by the sample obtaining module 110 to obtain the connection relation between the word segmentation texts.
Specifically, the word segmentation unit 131 uses a word segmentation technique to split the video text information into word segmentation texts such as characters, words, and sentences, and determines the part of speech of each word segmentation text. The relationship analysis unit 132 analyzes the sentence structure of the video text information to obtain the connection relations between the word segmentation texts, such as modifier-head relations.
The processing unit 133 determines a semantic word of the video sample corresponding to the video text information according to the word segmentation text obtained by the word segmentation unit 131, the word segmentation part of speech, and the connection relationship obtained by the relationship analysis unit 132, where the word segmentation text includes the semantic word.
Specifically, the processing unit 133 determines the semantics of the video text information from the word segmentation texts, their parts of speech, and the connection relations, and takes one or more word segmentation texts as the semantic word of the video sample corresponding to the video text information; that is, the semantic word is the extracted semantics of the video sample.
The semantic slot establishing unit 134 establishes the semantic slot according to the video text information obtained by the sample obtaining module 110 and the semantic word obtained by the processing unit 133, and establishes a corresponding relationship between the video text information and the semantic word in the semantic slot.
Specifically, the semantic slot establishing unit 134 establishes a semantic slot according to the video text information and the semantic word, and establishes a corresponding relationship between the video text information and the semantic word in the semantic slot, so as to facilitate searching according to the corresponding relationship.
And a generating unit 135 configured to generate a corresponding regular expression according to the word segmentation text obtained by the word segmentation unit 131, the word segmentation part of speech, and the connection relationship obtained by the relationship analysis unit 132.
Specifically, the generating unit 135 generates a corresponding regular expression from the word segmentation texts, their parts of speech, and the connection relations: depending on its part of speech, a word segmentation text may be replaced by a part-of-speech placeholder in the regular expression, while a word segmentation text that is a semantic word is kept as its original literal text, and the connection relations between the word segmentation texts are preserved in the regular expression.
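This replacement rule can be illustrated with a toy generator. The `POS_PATTERNS` table, the tag names, and the whitespace joining are all assumptions of the sketch rather than details from the source.

```python
import re

# Hypothetical part-of-speech placeholders.
POS_PATTERNS = {"verb": r"\w+", "det": r"(?:the|a|an)"}

def generate_regex(tagged_words, semantic_word):
    """tagged_words: list of (word, part_of_speech) pairs in sentence order.
    The semantic word stays literal; other words become placeholders."""
    parts = []
    for word, pos in tagged_words:
        if word == semantic_word:
            parts.append(re.escape(word))              # keep the semantic word literal
        else:
            parts.append(POS_PATTERNS.get(pos, re.escape(word)))
    return r"\s+".join(parts)                          # word order is preserved

tagged = [("study", "verb"), ("the", "det"), ("trigonometric function", "noun")]
pattern = generate_regex(tagged, "trigonometric function")
# The generated pattern also matches sentences like "review the trigonometric function".
```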
And a database establishing unit 136, configured to establish the regular expression library according to the video text information obtained by the sample obtaining module 110 and the regular expression generated by the generating unit 135, and establish a correspondence between the video text information and the regular expression in the regular expression library.
Specifically, the database establishing unit 136 establishes a regular expression library according to the video text information and the regular expression, and establishes a corresponding relation between the video text information and the regular expression in the regular expression library, so as to facilitate subsequent searching according to the corresponding relation.
The voice obtaining module 140 obtains video voice, and compares the video voice with the voice waveform library established by the waveform library establishing module 120 to obtain video information corresponding to the video voice.
The marking module 150 is configured to match the video information obtained by the voice obtaining module 140 with the semantic slots and the regular expression library established by the database establishing module 130, obtain video content semantics, and mark the video content semantics as video marking information for the video voice.
The marking module 150 specifically includes:
The control unit 151 performs word segmentation on the video information through a word segmentation technology to obtain video word segmentation, and generates a video regular expression according to the video information.
And a matching unit 152, configured to match the video segmentation obtained by the control unit 151 with the semantic segmentation of the semantic slot, and match the video regular expression obtained by the control unit 151 with a regular expression in the regular expression library.
Specifically, the control unit 151 segments the video information using a word segmentation technique, splitting it into video segmentation words such as words and sentences, and analyzes the video segmentation words, their parts of speech, and the connection relations between them to generate the video regular expression corresponding to the video information. The matching unit 152 matches the video segmentation words one by one against the semantic words of the semantic slot, and matches the video regular expression one by one against the regular expressions in the regular expression library.
And a semantic determining unit 153, configured to use the corresponding semantic word in the semantic slot as the semantic of the video content if the video word obtained by the matching unit 152 matches and conforms to the video regular expression.
Specifically, if one of the video segmentation words matches a semantic word in the semantic slot, and the video regular expression matches a regular expression in the regular expression library, the semantic determining unit 153 takes the matched semantic word in the semantic slot as the video content semantics corresponding to the video voice.
The semantic determining unit 153 is further configured to determine the video content semantics of the video information and update the semantic slot accordingly if the video regular expression obtained by the matching unit 152 matches but none of the video segmentation words match.
Specifically, if the video regular expression matches a regular expression in the regular expression library but none of the video segmentation words matches a semantic word in the semantic slot, the semantic determining unit 153 analyzes the video information to determine the video content semantics and updates the semantic slot with the determined semantics.
And a marking unit 154 marking the video content semantic determined by the semantic determining unit 153 as video marking information.
Specifically, the marking unit 154 marks the video voice with the determined video content semantics as video marking information, so as to facilitate the subsequent user to find.
The searching module 160 obtains user search information, matches the user search information with the video marking information obtained by the marking module 150, and determines a user target video.
The search module 160 specifically includes:
the acquisition unit 161 acquires user search information.
Specifically, the acquisition unit 161 acquires the user search information. Since voice is currently one of the main modes of human-machine interaction, when the user search information is input by voice, it needs to be converted into corresponding text information.
And a searching unit 162, configured to, when the user search information acquired by the acquiring unit 161 is a keyword, match the keyword with the video tag information, and determine the user target video.
Specifically, if the user search information input by the user is a keyword, the search unit 162 matches the acquired keyword against the video marking information of each video voice to determine the user target video. For example, if the user directly inputs the user search information "trigonometric function", "trigonometric function" is matched against the video marking information of each video voice to obtain the video voice whose marking information is "trigonometric function".
The searching unit 162 matches the key sentence with the regular expression library when the user search information obtained by the obtaining unit 161 is a key sentence, determines a semantic word corresponding to the semantic slot according to a regular expression matched with the key sentence, obtains corresponding video mark information, and determines the user target video.
Specifically, when the user search information is a key sentence, the search unit 162 matches the key sentence against the regular expression library, determines the video text information corresponding to the matched regular expression according to the correspondence between video text information and regular expressions in the library, then determines the corresponding semantic word according to the correspondence between video text information and semantic words in the semantic slot, obtains the corresponding video marking information, and determines the user target video.
For example, if the user inputs the user search information "find video voice about trigonometric function", the key sentence is matched against the regular expression library; the match yields the corresponding semantic word "trigonometric function", which is then matched against the video marking information of each video voice to obtain the video voice whose marking information is "trigonometric function".
In this embodiment, a video sample is obtained and analyzed, a word segmentation technique is used to segment video text information corresponding to the video sample, and the sentence structure of the video text information is analyzed, so as to establish a semantic slot and a regular expression library. Matching the acquired video voice with the established semantic slots and the regular expression library, so as to determine the video content semantics corresponding to the video voice, and marking the video voice as video marking information.
It should be noted that the above embodiments can be freely combined as needed. The foregoing is merely a preferred embodiment of the present invention; modifications and adaptations may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the present invention.

Claims (10)

1. A method for searching for video of a specific content, comprising:
acquiring a video sample, and generating corresponding video text information and video waveforms according to the video sample;
establishing a voice waveform library according to the video text information and the video waveform;
establishing a semantic slot and a regular expression library according to the video text information;
Obtaining video voice, and comparing the video voice with the voice waveform library to obtain video information corresponding to the video voice;
matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information;
and acquiring user search information, matching the user search information with the video mark information, and determining a user target video.
2. The method for searching video of specific content according to claim 1, wherein said creating a semantic slot and a regular expression library according to said video text information specifically comprises:
the video text information is segmented through a segmentation technology to obtain corresponding segmented text and segmented part of speech corresponding to the segmented text;
analyzing sentence structure in the video text information to obtain connection relation between the word segmentation texts;
determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text, the word segmentation part of speech and the connection relation, wherein the word segmentation text contains the semantic word segmentation;
Establishing a semantic slot according to the video text information and the semantic word, and establishing a corresponding relation between the video text information and the semantic word in the semantic slot;
generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation;
and establishing the regular expression library according to the video text information and the regular expression, and establishing a corresponding relation between the video text information and the regular expression in the regular expression library.
3. The method for searching video of specific content according to claim 2, wherein said matching the video information with the semantic slots and the regular expression library to obtain video content semantics, and marking the video content semantics as video marking information specifically includes:
the video information is segmented through a segmentation technology to obtain video segmentation, and a video regular pattern is generated according to the video information;
matching the video segmentation words with the semantic segmentation words of the semantic slots, and matching the video regular expressions with regular expressions in the regular expression library;
if both the video segmentation words and the video regular expression match, using the corresponding semantic segmentation word in the semantic slot as the video content semantics;
if the video regular expression matches but none of the video segmentation words match, determining the video content semantics of the video information, and updating the semantic slot according to the video content semantics;
and marking the video voice by taking the video content semantic as video marking information.
4. The method for searching for videos of specific contents according to claim 1, wherein the obtaining user search information, matching the user search information with the video tag information, and determining the user target video specifically comprises:
acquiring user search information;
and when the user searching information is a keyword, matching the keyword with the video marking information, and determining the user target video.
5. The method for searching for video of a specific content according to claim 4, wherein said obtaining user search information, matching said user search information with said video tag information, determining a user target video further comprises:
When the user search information is a key sentence, matching the key sentence with the regular expression library, determining semantic segmentation corresponding to the semantic slot according to a matched regular expression, obtaining corresponding video mark information, and determining the user target video.
6. A search system for video of a specific content, comprising:
the sample acquisition module acquires a video sample and generates corresponding video text information and video waveforms according to the video sample;
the waveform library establishing module is used for establishing a voice waveform library according to the video text information and the video waveform obtained by the sample obtaining module;
the database establishing module is used for establishing a semantic slot and a regular expression library according to the video text information obtained by the sample obtaining module;
the voice acquisition module acquires video voice, compares the video voice with the voice waveform library established by the waveform library establishment module, and obtains video information corresponding to the video voice;
the marking module is used for matching the video information obtained by the voice obtaining module with the semantic slots and the regular expression library established by the database establishing module to obtain video content semantics, and marking the video voice by taking the video content semantics as video marking information;
And the searching module is used for acquiring user searching information, matching the user searching information with the video marking information obtained by the marking module and determining a user target video.
7. The system for searching for videos of a specific content according to claim 6, wherein the database creation module specifically comprises:
the word segmentation unit is used for segmenting the video text information obtained by the sample obtaining module through a word segmentation technology to obtain a corresponding word segmentation text and a word segmentation part of speech corresponding to the word segmentation text;
the relation analysis unit is used for analyzing sentence structures in the video text information obtained by the sample acquisition module to obtain connection relations among the word segmentation texts;
the processing unit is used for determining semantic word segmentation of the video sample corresponding to the video text information according to the word segmentation text obtained by the word segmentation unit, the word segmentation part of speech and the connection relation obtained by the relation analysis unit, wherein the word segmentation text comprises the semantic word segmentation;
the semantic slot establishing unit establishes the semantic slot according to the video text information obtained by the sample obtaining module and the semantic word obtained by the processing unit, and establishes the corresponding relation between the video text information and the semantic word in the semantic slot;
The generation unit is used for generating a corresponding regular expression according to the word segmentation text, the word segmentation part of speech and the connection relation obtained by the relation analysis unit;
the database establishing unit establishes the regular expression library according to the video text information obtained by the sample obtaining module and the regular expression generated by the generating unit, and establishes the corresponding relation between the video text information and the regular expression in the regular expression library.
8. The system for searching for videos of a specific content according to claim 7, wherein the marking module specifically comprises:
the control unit is used for segmenting the video information through a word segmentation technology to obtain video segmentation, and generating a video regular pattern according to the video information;
the matching unit is used for matching the video segmentation obtained by the control unit with the semantic segmentation of the semantic slot and matching the video regular expression obtained by the control unit with the regular expression in the regular expression library;
the semantic determining unit is used for taking the corresponding semantic segmentation word in the semantic slot as the video content semantics if both the video segmentation words and the video regular expression obtained by the matching unit match;
the semantic determining unit is further used for determining the video content semantics of the video information and updating the semantic slot according to the video content semantics if the video regular expression obtained by the matching unit matches but none of the video segmentation words match;
and the marking unit marks the video voice by taking the video content semantics determined by the semantics determining unit as video marking information.
9. The system for searching for video of a specific content according to claim 6, wherein the searching module specifically comprises:
an acquisition unit that acquires user search information;
and the searching unit is used for matching the keyword with the video mark information when the user search information acquired by the acquiring unit is the keyword, and determining the user target video.
10. The content-specific video search system of claim 9, wherein the lookup module further comprises:
and the searching unit is used for matching the key sentences with the regular expression library when the user searching information is the key sentences, determining the semantic segmentation corresponding to the semantic slots according to the matched regular expressions to obtain corresponding video mark information and determining the user target video.
CN201910047102.1A 2019-01-18 2019-01-18 Method and system for searching video of specific content Active CN109783821B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910047102.1A CN109783821B (en) 2019-01-18 2019-01-18 Method and system for searching video of specific content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910047102.1A CN109783821B (en) 2019-01-18 2019-01-18 Method and system for searching video of specific content

Publications (2)

Publication Number Publication Date
CN109783821A CN109783821A (en) 2019-05-21
CN109783821B true CN109783821B (en) 2023-06-27

Family

ID=66501664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910047102.1A Active CN109783821B (en) 2019-01-18 2019-01-18 Method and system for searching video of specific content

Country Status (1)

Country Link
CN (1) CN109783821B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559798B * 2019-09-26 2022-05-17 Beijing Xintang Sichuang Education Technology Co., Ltd. Method and device for detecting quality of audio content
CN112989120B * 2021-05-13 2021-08-03 Guangdong Zhongju Artificial Intelligence Technology Co., Ltd. Video clip query system and video clip query method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105723450A (en) * 2013-11-13 2016-06-29 Google Inc. Envelope comparison for utterance detection
CN109196495A (en) * 2016-03-23 2019-01-11 Amazon Technologies, Inc. Fine-grained natural language understanding

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8347247B2 (en) * 2008-10-17 2013-01-01 International Business Machines Corporation Visualization interface of continuous waveform multi-speaker identification
CN103778204A * 2014-01-13 2014-05-07 Beijing Qihoo Technology Co., Ltd. Voice analysis-based video search method, equipment and system
CN106326303B * 2015-06-30 2019-09-13 Yutou Technology (Hangzhou) Co., Ltd. Spoken language semantic analysis system and method
CN105138575B * 2015-07-29 2017-09-05 Baidu Online Network Technology (Beijing) Co., Ltd. Analysis method and device for speech text strings
CN105512105B * 2015-12-07 2019-05-31 Baidu Online Network Technology (Beijing) Co., Ltd. Semantic analysis method and device
CN105786793B * 2015-12-23 2019-05-28 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for parsing semantics of spoken text information
CN107071542B * 2017-04-18 2020-07-28 Baidu Online Network Technology (Beijing) Co., Ltd. Video clip playing method and device
CN107315737B * 2017-07-04 2021-03-23 Beijing QIYI Century Science and Technology Co., Ltd. Semantic logic processing method and system
CN108182229B * 2017-12-27 2022-10-28 Shanghai Iflytek Information Technology Co., Ltd. Information interaction method and device


Also Published As

Publication number Publication date
CN109783821A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN107315737B (en) Semantic logic processing method and system
US10169703B2 (en) System and method for analogy detection and analysis in a natural language question and answering system
JP5167546B2 (en) Sentence search method, sentence search device, computer program, recording medium, and document storage device
CN108549628B (en) Sentence-breaking device and method for stream type natural language information
US10929613B2 (en) Automated document cluster merging for topic-based digital assistant interpretation
KR20110038474A (en) Apparatus and method for detecting sentence boundaries
CN111488468B (en) Geographic information knowledge point extraction method and device, storage medium and computer equipment
JP2021197133A (en) Meaning matching method, device, electronic apparatus, storage medium, and computer program
JPWO2008016102A1 (en) Similarity calculation device and information retrieval device
CN112417102A (en) Voice query method, device, server and readable storage medium
CN110119510B (en) Relationship extraction method and device based on transfer dependency relationship and structure auxiliary word
CN111508502B (en) Alternative method and system for displaying results
WO2019133856A2 (en) Automated discourse phrase discovery for generating an improved language model of a digital assistant
CN110782892B (en) Voice text error correction method
JP2007087397A (en) Morphological analysis program, correction program, morphological analyzer, correcting device, morphological analysis method, and correcting method
CN109783821B (en) Method and system for searching video of specific content
CN113343108B (en) Recommended information processing method, device, equipment and storage medium
CN111292751A (en) Semantic analysis method and device, voice interaction method and device, and electronic equipment
CN115544303A (en) Method, apparatus, device and medium for determining label of video
CN113821593A (en) Corpus processing method, related device and equipment
CN115759071A (en) Government affair sensitive information identification system and method based on big data
CN114996506A (en) Corpus generation method and device, electronic equipment and computer-readable storage medium
CN113658594A (en) Lyric recognition method, device, equipment, storage medium and product
CN109800430B (en) Semantic understanding method and system
CN109766551B (en) Method and system for determining ambiguous word semantics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant