CN108874815A - The search method and device of audio-video - Google Patents
- Publication number
- CN108874815A (CN201710328019.2A)
- Authority
- CN
- China
- Prior art keywords
- video
- audio
- sentence
- retrieval
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses an audio-video retrieval method and device. The method includes: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; matching the retrieval sentence against the multiple sentences that an index points to, and obtaining the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; returning the audio-video files corresponding to the successfully matched sentences; and determining the target audio-video file among those corresponding audio-video files. The application solves the problem of low audio-video retrieval accuracy in the related art.
Description
Technical field
This application relates to the field of audio and video processing, and in particular to an audio-video retrieval method and device.
Background
Audio and video files often need to be searched, especially files such as telephone recordings, interview recordings, and interview videos. The usual approach is to build a full-text index over the names of the audio and video files and then search those names. This approach, however, can search only audio and video titles; it knows nothing about the content, so the files it finds may not be the ones wanted. In the related art, a synopsis is added to each audio or video file, and full-text search is then run over both the title and the synopsis. Although a file can be found through its synopsis, writing synopses requires substantial manual effort, and if a synopsis does not correspond to the content of its file, retrieval accuracy drops sharply.
No effective solution has yet been proposed for the low accuracy of audio-video retrieval in the related art.
Summary of the invention
The main purpose of this application is to provide an audio-video retrieval method and device that solve the problem of low audio-video retrieval accuracy in the related art.
To achieve this purpose, one aspect of the application provides an audio-video retrieval method. The method includes: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; matching the retrieval sentence against the multiple sentences that an index points to, and obtaining the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; returning the audio-video files corresponding to the successfully matched sentences; and determining the target audio-video file among those corresponding audio-video files.
Further, before the retrieval sentence is matched against the multiple sentences that the index points to, the method also includes: converting each audio-video file in the audio-video set into corresponding text; splitting the text of each audio-video file according to a first preset condition to obtain the multiple sentences corresponding to that file; and determining the start position and end position of each sentence within its audio-video file.
Further, after the target audio-video file is determined among the corresponding audio-video files, the method also includes: determining, from the start and end positions of each sentence in the target audio-video file, the start and end positions of a target sentence in that file, where the target sentence is the sentence in the target audio-video file that successfully matched the retrieval sentence; and sending the start and end positions of the target sentence in the target audio-video file to an audio/video player, where, after receiving them, the audio/video player jumps to the start position in the target audio-video file and begins playing.
Further, determining the start and end positions of each sentence within its audio-video file includes: dividing each audio-video file into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text; and matching the sentences of each audio-video file, one by one and in order, against the text of the segments of the current audio-video file, to determine the start and end positions of each sentence in the current file.
Further, before the retrieval sentence is matched against the multiple sentences that the index points to, the method also includes: creating an index over the sentences of each audio-video file based on target information, where the target information includes the sentence content, the start and end positions of the sentence within its audio-video file, and the name of that audio-video file.
To achieve the above purpose, another aspect of the application provides an audio-video retrieval device. The device includes: a first acquisition unit, configured to obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; a matching unit, configured to match the retrieval sentence against the multiple sentences that an index points to and obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; a second acquisition unit, configured to return the audio-video files corresponding to the successfully matched sentences; and a first determination unit, configured to determine the target audio-video file among the corresponding audio-video files.
Further, the device includes: a conversion unit, configured to convert each audio-video file in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences that the index points to; a splitting unit, configured to split the text of each audio-video file according to a first preset condition to obtain the multiple sentences corresponding to that file; and a second determination unit, configured to determine the start and end positions of each sentence within its audio-video file.
Further, the device also includes: a third determination unit, configured to determine, after the target audio-video file has been determined among the corresponding audio-video files and from the start and end positions of each sentence in the target audio-video file, the start and end positions of a target sentence in that file, where the target sentence is the sentence in the target audio-video file that successfully matched the retrieval sentence; and a sending unit, configured to send the start and end positions of the target sentence in the target audio-video file to an audio/video player, where, after receiving them, the audio/video player jumps to the start position in the target audio-video file and begins playing.
To achieve the above purpose, another aspect of the application provides a storage medium that includes a stored program, where the program executes any one of the audio-video retrieval methods described above.
To achieve the above purpose, another aspect of the application provides a processor for running a program, where the program, when run, executes any one of the audio-video retrieval methods described above.
The application uses the following steps: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; matching the retrieval sentence against the multiple sentences that an index points to, and obtaining the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; returning the audio-video files corresponding to the successfully matched sentences; and determining the target audio-video file among those corresponding audio-video files. This solves the problem of low audio-video retrieval accuracy in the related art. Because the retrieval sentence is matched against the sentences that the index points to, and the target audio-video file is then determined among the files corresponding to the matched sentences, the accuracy of audio-video retrieval is improved.
Brief description of the drawings
The accompanying drawings, which form a part of this application, provide a further understanding of the application. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the audio-video retrieval method provided by an embodiment of the application; and
Fig. 2 is a schematic diagram of the audio-video retrieval device provided by an embodiment of the application.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of the application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of the application.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the drawings distinguish similar objects and do not describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of the application, an audio-video retrieval method is provided.
Fig. 1 is a flowchart of the audio-video retrieval method according to an embodiment of the application. As shown in Fig. 1, the method includes the following steps:
Step S101: obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file.
To find the target audio-video file the user is looking for among multiple audio and video files such as telephone recordings, interview recordings, and interview videos, the retrieval sentence entered by the user is received and the target file is retrieved based on it. Alternatively, speech entered by the user is received and converted to text by speech recognition, and that text serves as the retrieval sentence.
Step S102: match the retrieval sentence against the multiple sentences that an index points to, and obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set.
The retrieval sentence is matched against the sentences that the index points to; any sentence it matches is a sentence that successfully matches the retrieval sentence. In this application, the multiple sentences are the sentences corresponding to each audio-video file in the audio-video set. In the audio-video retrieval method provided by the embodiment, before the retrieval sentence is matched against the sentences that the index points to, the method also includes: converting each audio-video file in the audio-video set into corresponding text; splitting the text of each audio-video file according to a first preset condition to obtain the multiple sentences corresponding to that file; and determining the start and end positions of each sentence within its audio-video file.
It should be noted that the index mentioned in this application is an index in a search engine (for example, an inverted index). The search engine used in this application may be ElasticSearch, Solr, or the like.
Specifically, all the audio and video files within the user's search scope are taken as the audio-video set, and each audio-video file in the set is converted into corresponding text. For example, telephone recording 1 in the set is converted into its text, telephone recording 2 in the set is converted into its text, and so on.
For example, the converted text is split at specific punctuation marks, such as periods, question marks, and semicolons (this corresponds to the first preset condition above), which yields the multiple sentences of each audio-video file. After splitting, the sentences of each audio-video file can be stored in a data table, with each row of the table holding one sentence.
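The splitting step can be sketched in a few lines of Python. This is only an illustration: the punctuation set, the function name `split_sentences`, and the sample transcript are assumptions, and the patent does not prescribe a particular implementation of the first preset condition.

```python
import re

# Assumed "first preset condition": split after a period, question mark, or
# semicolon (ASCII or full-width). The exact punctuation set is an assumption.
SENTENCE_END = re.compile(r"(?<=[.?;。？；])\s*")

def split_sentences(text):
    """Split a transcript into sentences, keeping the closing punctuation."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]

# Hypothetical transcript of one recording.
transcript = (
    "Today I have a lot of work to do. I need to work overtime until very late; "
    "I cannot leave on time. Is tomorrow the weekend?"
)

# Each list element would become one row of the data table.
rows = split_sentences(transcript)
for row in rows:
    print(row)
```

A production system would likely also need to handle abbreviations and decimal points, which this sketch splits on blindly.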
Optionally, in the audio-video retrieval method provided by the embodiment, determining the start and end positions of each sentence within its audio-video file includes: dividing each audio-video file into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text; and matching the sentences of each audio-video file, one by one and in order, against the text of the segments of the current audio-video file, to determine the start and end positions of each sentence in the current file.
For example, once the sentences of each audio-video file have been stored in the data table after splitting, the first sentence in the table, item1, is fetched and its start position is marked as 0. A segment s1 of length L, for example L = 5 seconds, is cut from the audio-video file beginning at that start position, so s1 starts at 0 and ends at 5 s. The segment is converted to text t1, and t1 is matched against item1. If item1 contains t1, that is, t1 is a substring of item1, and t1 matches item1 exactly, then the start position of item1 is 0 and its end position is 5, and processing can move on to the next sentence, item2.
If t1 is only the first few words of item1, for example t1 = "Today I", a second segment s2 of length L is cut starting from the end position of s1, and s2 is converted to text t2. If t2 is the closing string of item1, the end position of item1 is the end position of s2, namely 10 s. Once this match is complete, the next sentence, item2, can be matched.
If only part of t2 belongs to item1, for example t2 = "work to do I need", then t2 spans two sentences. In that case the length of segment s2 is reduced and sub-segments are matched. For example, a sub-segment 2 s long is cut from s2; if its converted text is exactly "work to do", the end position of item1 is 5 + 2 = 7 s. When the next sentence, item2, is matched, audio segments are then cut starting from 7 s.
If t2 does not reach the end of item1, a further segment s3 is cut, and so on iteratively. After one sentence in the data table has been processed, the next sentence is processed, until all sentences are done. The resulting data table looks like Table 1 below:
Table 1
Sentence | Start position (s) | End position (s) |
Today I have a lot of work to do. | 0 | 6 |
I need to work overtime until very late and cannot leave work on time. | 6 | 15 |
Tomorrow is the weekend, so I can rest for two days. | 15 | 24 |
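The segment-by-segment alignment described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: `make_fake_transcriber` is a hypothetical stand-in for real speech-to-text that "speaks" one word per second, the sample sentences are chosen so that the result reproduces the positions in Table 1, and the names `align`, `covers`, `seg_len`, and `step` are invented for the example. The window grows by `seg_len` seconds (L in the text) until its transcript covers the sentence, then the last segment is re-scanned in `step`-second sub-segments to find where the sentence ends.

```python
import re

def words(text):
    """Lowercase word list; punctuation is ignored when comparing."""
    return re.findall(r"\w+", text.lower())

def make_fake_transcriber(sentences, seconds_per_word=1.0):
    """Hypothetical speech-to-text stand-in: one word is spoken per second."""
    timeline, t = [], 0.0
    for sentence in sentences:
        for w in words(sentence):
            timeline.append((t, w))
            t += seconds_per_word

    def transcribe(start, end):
        # Words whose start time falls inside [start, end).
        return [w for ts, w in timeline if start <= ts < end]

    return transcribe, t  # t is the total duration in seconds

def covers(heard, target):
    """True when the transcript so far contains the whole target sentence."""
    return heard[:len(target)] == target

def align(sentences, transcribe, total, seg_len=5.0, step=1.0):
    """Return (sentence, start, end) rows by growing fixed-length segments."""
    rows, start = [], 0.0
    for sentence in sentences:
        target = words(sentence)
        end = start
        while end < total:
            nxt = min(end + seg_len, total)
            if covers(transcribe(start, nxt), target):
                # The sentence ends inside (end, nxt]; refine with sub-segments.
                probe = end
                while probe < nxt:
                    probe = min(probe + step, nxt)
                    if covers(transcribe(start, probe), target):
                        break
                end = probe
                break
            end = nxt  # sentence not finished yet: cut the next segment
        rows.append((sentence, start, end))
        start = end  # the next sentence starts where this one ended
    return rows

sentences = [
    "Today I have much work waiting.",
    "I must work late and cannot leave on time.",
    "Tomorrow is the weekend so I can finally rest.",
]
transcribe, total = make_fake_transcriber(sentences)
rows = align(sentences, transcribe, total)
for sentence, start, end in rows:
    print(f"{sentence} | {start:g} | {end:g}")
```

With a real recognizer, a segment boundary can fall in the middle of a word and the transcript may contain recognition errors, so exact word-list comparison would probably have to be replaced by a tolerant match.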
Optionally, in the audio-video retrieval method provided by the embodiment, before the retrieval sentence is matched against the multiple sentences that the index points to, the method also includes: creating an index over the sentences of each audio-video file based on target information, where the target information includes the sentence content, the start and end positions of the sentence within its audio-video file, and the name of that audio-video file.
Once the data table containing the sentences and the start and end positions of each sentence has been obtained, an index is created over the sentences of each audio-video file based on the sentence content, the start and end positions of the sentence within its audio-video file, and the name of that file. After the index has been created over the data table, the retrieval sentence is matched against the sentences that the index points to; on a successful match, the content of the matched sentence, its start and end positions within its audio-video file, the name of that file, and so on can be obtained.
It should be noted that the target information may also include other information, such as the storage address of the audio-video file. The items of target information given in this application are examples, and the target information is not limited to them.
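To make the role of the index concrete, here is a toy inverted index in plain Python. It is a sketch only: the source names ElasticSearch and Solr as possible engines, and this example does not use or imitate their APIs. The record fields, the sample data, and the rule that a sentence matches when it contains every query word are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    return re.findall(r"\w+", text.lower())

# Each record carries the "target information": sentence content,
# start and end positions (seconds), and the name of the file.
records = [
    {"sentence": "Today I have a lot of work to do.",
     "start": 0, "end": 6, "file": "interview video 3"},
    {"sentence": "I must work late and cannot leave on time.",
     "start": 6, "end": 15, "file": "interview video 3"},
    {"sentence": "The work was finished on time.",
     "start": 0, "end": 4, "file": "telephone recording 1"},
]

def build_index(records):
    """Map each word to the ids of the records whose sentence contains it."""
    index = defaultdict(set)
    for rec_id, rec in enumerate(records):
        for token in tokenize(rec["sentence"]):
            index[token].add(rec_id)
    return index

def search(index, records, query):
    """Return records containing every query word (assumed AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return []
    ids = set.intersection(*(index.get(t, set()) for t in tokens))
    return [records[i] for i in sorted(ids)]

index = build_index(records)
hits = search(index, records, "on time")
for hit in hits:
    print(hit["file"], hit["start"], hit["end"], hit["sentence"])
```

Each hit carries the file name and the sentence positions, so the caller can list the matching files and later pass the positions of the chosen sentence to a player. A real search engine would add text analysis and relevance scoring on top of this lookup.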
Step S103: return the audio-video files corresponding to the sentences that successfully match the retrieval sentence.
Step S104: determine the target audio-video file among the corresponding audio-video files.
For example, the audio-video files returned as corresponding to the successfully matched sentences are: telephone recording 1, interview recording 1, interview video 3, interview video 5, and so on. The target audio-video file the user is looking for is determined among these corresponding files.
Because the retrieval sentence is matched against the sentences that the index points to, and the target audio-video file is then determined among the files corresponding to the matched sentences, the accuracy of audio-video retrieval is improved.
Optionally, in the audio-video retrieval method provided by the embodiment, after the target audio-video file is determined among the corresponding audio-video files, the method also includes: determining, from the start and end positions of each sentence in the target audio-video file, the start and end positions of a target sentence in that file, where the target sentence is the sentence in the target audio-video file that successfully matched the retrieval sentence; and sending the start and end positions of the target sentence in the target audio-video file to an audio/video player, where, after receiving them, the audio/video player jumps to the start position in the target audio-video file and begins playing.
For example, the target audio-video file is interview video 3, and the sentence in interview video 3 that successfully matches the retrieval sentence is "I need to work overtime until very late and cannot leave work on time." The start position of this sentence in interview video 3 is determined to be 6 and its end position 15, and these positions are sent to the audio/video player. The audio/video player jumps to start position 6 in interview video 3 and begins playing.
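The hand-off to the player can be illustrated with a small sketch. Everything here is hypothetical: `FakePlayer` stands in for whatever audio/video player is used, and its methods (`load`, `seek`, `play`) and the `Hit` record are invented for the example. The source states only that the start and end positions are sent and that the player jumps to the start position, so the end position is simply stored.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """A matched sentence with its positions (seconds) in its file."""
    file: str
    sentence: str
    start: float
    end: float

class FakePlayer:
    """Hypothetical stand-in for an audio/video player."""

    def __init__(self):
        self.file = None
        self.position = 0.0
        self.segment_end = None
        self.playing = False

    def load(self, file):
        self.file = file
        self.position = 0.0

    def seek(self, seconds):
        self.position = seconds

    def play(self):
        self.playing = True

def play_from_hit(player, hit):
    """Send the sentence positions to the player and start at the sentence."""
    player.load(hit.file)
    player.segment_end = hit.end  # passed along; its use is up to the player
    player.seek(hit.start)
    player.play()

hit = Hit(
    file="interview video 3",
    sentence="I need to work overtime until very late and cannot leave work on time.",
    start=6,
    end=15,
)
player = FakePlayer()
play_from_hit(player, hit)
print(player.file, player.position, player.segment_end, player.playing)
```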
With the above solution, a user can enter a piece of speech or a piece of text and find the corresponding audio-video files among many audio-video files; the target audio-video file is determined from them, and the audio/video player can jump to the position in the target file where the retrieval sentence begins and start playing. This improves the efficiency of audio-video retrieval, as well as its accuracy and the user experience.
In summary, the audio-video retrieval method provided by the embodiment obtains a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; matches the retrieval sentence against the multiple sentences that an index points to and obtains the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; returns the audio-video files corresponding to the successfully matched sentences; and determines the target audio-video file among those corresponding files. This solves the problem of low audio-video retrieval accuracy in the related art. Because the retrieval sentence is matched against the sentences that the index points to, and the target audio-video file is then determined among the files corresponding to the matched sentences, the accuracy of audio-video retrieval is improved.
It should be noted that the steps shown in the flowchart may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order.
An embodiment of the application also provides an audio-video retrieval device. It should be noted that this device can be used to execute the audio-video retrieval method provided by the embodiments of the application. The device is introduced below.
Fig. 2 is a schematic diagram of the audio-video retrieval device according to an embodiment of the application. As shown in Fig. 2, the device includes: a first acquisition unit 10, a matching unit 20, a second acquisition unit 30, and a first determination unit 40.
Specifically, the first acquisition unit 10 is configured to obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file.
The matching unit 20 is configured to match the retrieval sentence against the multiple sentences that an index points to and obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set.
The second acquisition unit 30 is configured to return the audio-video files corresponding to the successfully matched sentences.
The first determination unit 40 is configured to determine the target audio-video file among the corresponding audio-video files.
In the audio-video retrieval device provided by the embodiment, the first acquisition unit 10 obtains a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video file; the matching unit 20 matches the retrieval sentence against the multiple sentences that an index points to and obtains the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video file in an audio-video set; the second acquisition unit 30 returns the audio-video files corresponding to the successfully matched sentences; and the first determination unit 40 determines the target audio-video file among the corresponding audio-video files. This solves the problem of low audio-video retrieval accuracy in the related art. Because the retrieval sentence is matched against the sentences that the index points to, and the target audio-video file is then determined among the files corresponding to the matched sentences, the accuracy of audio-video retrieval is improved.
Optionally, in the audio-video retrieval device provided by the embodiment, the device includes: a conversion unit, configured to convert each audio-video file in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences that the index points to; a splitting unit, configured to split the text of each audio-video file according to a first preset condition to obtain the multiple sentences corresponding to that file; and a second determination unit, configured to determine the start and end positions of each sentence within its audio-video file.
Optionally, in the audio-video retrieval device provided by the embodiment, the device also includes: a third determination unit, configured to determine, after the target audio-video file has been determined among the corresponding audio-video files and from the start and end positions of each sentence in the target audio-video file, the start and end positions of a target sentence in that file, where the target sentence is the sentence in the target audio-video file that successfully matched the retrieval sentence; and a sending unit, configured to send the start and end positions of the target sentence in the target audio-video file to an audio/video player, where, after receiving them, the audio/video player jumps to the start position in the target audio-video file and begins playing.
The audio-video retrieval device includes a processor and a memory. The first acquisition unit 10, matching unit 20, second acquisition unit 30, first determination unit 40, and so on are stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which fetches the corresponding program unit from the memory. One or more kernels may be provided, and audio-video retrieval is carried out by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the invention provides a storage medium on which a program is stored; when executed by a processor, the program implements the audio-video retrieval method.
An embodiment of the invention provides a processor for running a program, where the program, when run, executes the audio-video retrieval method.
The embodiment of the invention provides a kind of equipment, equipment include processor, memory and storage on a memory and can
The program run on a processor, processor realize following steps when executing program:Obtain retrieval sentence, wherein the retrieval
Sentence is for retrieving target audio-video;The retrieval sentence is matched with multiple sentences that index is directed toward, is obtained
With the successful sentence of the retrieval statement matching, wherein the multiple sentence is that each audio-video in audio-video set is corresponding
Multiple sentences;Return to audio-video corresponding with the retrieval successful sentence of statement matching;And it is regarded in the corresponding sound
The target audio-video is determined in frequency.
Before the retrieval sentence is matched against the multiple sentences that the index points to, the method further includes: converting each audio-video in the audio-video set into corresponding text; splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and determining the start position and end position of each sentence in its corresponding audio-video.
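As a rough illustration of the splitting step, the sketch below assumes that the first preset condition is "cut after sentence-ending punctuation". The patent leaves the condition open, so both that rule and the function name `split_into_sentences` are assumptions; the speech-to-text conversion itself is out of scope here and the text is taken as given.

```python
import re

# Assumption: a sentence is a run of non-terminator characters followed by
# any sentence-ending punctuation (Western or Chinese full-width).
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def split_into_sentences(text):
    """Split transcribed text into sentences, dropping empty pieces."""
    pieces = (piece.strip() for piece in _SENTENCE.findall(text))
    return [piece for piece in pieces if piece]
```

A rule this simple also cuts at decimal points and abbreviations, so a real first preset condition would likely be stricter.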
After the target audio-video is determined among the corresponding audio-videos, the method further includes: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
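The hand-off to the player can be sketched as below. The `AudioVideoPlayer` class is a stand-in that only records what it was asked to do; it is not a real player API, and `locate_target_sentence` and `send_to_player` are hypothetical helper names. Substring matching is again assumed for finding the target sentence.

```python
class AudioVideoPlayer:
    """Stand-in player: records the requested title and playback range."""

    def __init__(self):
        self.title = None
        self.position = 0.0
        self.stop_at = None
        self.playing = False

    def receive(self, title, start, end):
        """Jump to the start position of the target sentence and start playing."""
        if start < 0 or end < start:
            raise ValueError("invalid playback range")
        self.title = title
        self.position = start
        self.stop_at = end
        self.playing = True


def locate_target_sentence(sentences, query):
    """Return (start, end) of the first sentence matching the query, or None.

    `sentences` is a list of (text, start, end) tuples for the target audio-video.
    """
    needle = query.strip().lower()
    for text, start, end in sentences:
        if needle and needle in text.lower():
            return start, end
    return None


def send_to_player(player, title, sentences, query):
    """Send the target sentence's positions to the player; False if no match."""
    span = locate_target_sentence(sentences, query)
    if span is None:
        return False
    player.receive(title, *span)
    return True
```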
Determining the start position and end position of each sentence in its corresponding audio-video includes: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the multiple sentences corresponding to each audio-video, one by one and in order, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
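One way to read this positioning step is sketched below: the audio-video is cut into fixed-length segments, the segment texts are joined into one transcript, and each sentence is searched for in order, taking its start from the segment where it begins and its end from the segment where it finishes. The patent does not describe the matching in this detail, so the approach, the segment-level resolution, and the names `segment_bounds`, `normalize` and `align_sentences` are all assumptions.

```python
import re


def segment_bounds(duration, segment_length):
    """Cut [0, duration) into consecutive segments of a preset length (seconds)."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    duration = float(duration)
    bounds, start = [], 0.0
    while start < duration:
        end = min(start + segment_length, duration)
        bounds.append((start, end))
        start = end
    return bounds


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace before comparing."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def align_sentences(sentences, segments):
    """Map each sentence to (start, end); `segments` is [(start, end, text), ...].

    Sentences are searched in order, so a sentence is never placed before
    the previous one. Unmatched sentences map to None.
    """
    transcript, owner = "", []
    for i, (_, _, text) in enumerate(segments):
        norm = normalize(text)
        if not norm:
            continue
        piece = norm + " "
        transcript += piece
        owner.extend([i] * len(piece))  # segment index for every character

    positions, cursor = [], 0
    for sentence in sentences:
        needle = normalize(sentence)
        found = transcript.find(needle, cursor) if needle else -1
        if found < 0:
            positions.append(None)
            continue
        last = found + len(needle) - 1
        positions.append((segments[owner[found]][0], segments[owner[last]][1]))
        cursor = found + len(needle)
    return positions
```

With this design a sentence's positions are only as precise as the segment length, and a word cut in half at a segment boundary would prevent a match; shorter segments trade better precision against more such cuts.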
Before the retrieval sentence is matched against the multiple sentences that the index points to, the method further includes: creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes: the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video.
The device herein may be a server, a PC, a PAD, a mobile phone, or the like.
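The index built from the target information (sentence content, start position, end position, audio-video title) could take many forms. A minimal sketch, assuming a word-level inverted index with AND semantics, is shown below; the patent does not name an index structure, and `build_index`, `lookup` and the record keys are hypothetical.

```python
from collections import defaultdict

_PUNCT = ".,!?;:"


def build_index(records):
    """Build an inverted index over sentence records.

    Each record is a dict with the keys "content", "start", "end" and "title".
    Returns (entries, postings): the stored records and a word -> entry-id map.
    """
    entries = []
    postings = defaultdict(set)
    for record in records:
        entry_id = len(entries)
        entries.append(dict(record))
        for word in record["content"].lower().split():
            word = word.strip(_PUNCT)
            if word:
                postings[word].add(entry_id)
    return entries, postings


def lookup(query, entries, postings):
    """Return the records whose sentence contains every word of the query."""
    words = [w.strip(_PUNCT) for w in query.lower().split()]
    words = [w for w in words if w]
    if not words:
        return []
    ids = set.intersection(*(postings.get(w, set()) for w in words))
    return [entries[i] for i in sorted(ids)]
```

Because every entry carries its title and positions, a hit can be handed directly to the playback step without a second lookup. Note that this sketch ignores word order, so it is looser than exact phrase matching.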
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: obtaining a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video; matching the retrieval sentence against the multiple sentences that an index points to, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returning the audio-videos corresponding to the successfully matched sentences; and determining the target audio-video among the corresponding audio-videos.
Before the retrieval sentence is matched against the multiple sentences that the index points to, the method further includes: converting each audio-video in the audio-video set into corresponding text; splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and determining the start position and end position of each sentence in its corresponding audio-video.
After the target audio-video is determined among the corresponding audio-videos, the method further includes: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
Determining the start position and end position of each sentence in its corresponding audio-video includes: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the multiple sentences corresponding to each audio-video, one by one and in order, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
Before the retrieval sentence is matched against the multiple sentences that the index points to, the method further includes: creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes: the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, in forms such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit it. For those skilled in the art, various modifications and variations of the present application are possible. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall fall within the scope of its claims.
Claims (10)
1. An audio-video retrieval method, characterized by comprising:
obtaining a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video;
matching the retrieval sentence against the multiple sentences that an index points to, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set;
returning the audio-videos corresponding to the sentences that successfully match the retrieval sentence; and
determining the target audio-video among the corresponding audio-videos.
2. The method according to claim 1, characterized in that, before the retrieval sentence is matched against the multiple sentences that the index points to, the method further comprises:
converting each audio-video in the audio-video set into corresponding text;
splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and
determining the start position and end position of each sentence in its corresponding audio-video.
3. The method according to claim 2, characterized in that, after the target audio-video is determined among the corresponding audio-videos, the method further comprises:
determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and
sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
4. The method according to claim 2, characterized in that determining the start position and end position of each sentence in its corresponding audio-video comprises:
dividing each audio-video into audio-video segments of a preset length according to a second preset condition;
converting each audio-video segment into corresponding text information; and
matching the multiple sentences corresponding to each audio-video, one by one and in order, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
5. The method according to claim 2, characterized in that, before the retrieval sentence is matched against the multiple sentences that the index points to, the method further comprises:
creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes: the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video.
6. An audio-video retrieval device, characterized by comprising:
a first acquisition unit, configured to obtain a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video;
a matching unit, configured to match the retrieval sentence against the multiple sentences that an index points to, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set;
a second acquisition unit, configured to return the audio-videos corresponding to the sentences that successfully match the retrieval sentence; and
a first determination unit, configured to determine the target audio-video among the corresponding audio-videos.
7. The device according to claim 6, characterized in that the device further comprises:
a conversion unit, configured to convert each audio-video in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences that the index points to;
a splitting unit, configured to split the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and
a second determination unit, configured to determine the start position and end position of each sentence in its corresponding audio-video.
8. The device according to claim 7, characterized in that the device further comprises:
a third determination unit, configured to determine, after the target audio-video is determined among the corresponding audio-videos and according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and
a sending unit, configured to send the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein the program executes the audio-video retrieval method according to any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the audio-video retrieval method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710328019.2A CN108874815A (en) | 2017-05-10 | 2017-05-10 | The search method and device of audio-video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710328019.2A CN108874815A (en) | 2017-05-10 | 2017-05-10 | The search method and device of audio-video |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108874815A true CN108874815A (en) | 2018-11-23 |
Family
ID=64319325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710328019.2A Pending CN108874815A (en) | 2017-05-10 | 2017-05-10 | The search method and device of audio-video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108874815A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110275979A (en) * | 2019-07-01 | 2019-09-24 | 成都启英泰伦科技有限公司 | A kind of mapping management process of voice data and text data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070087312A1 (en) * | 2005-10-18 | 2007-04-19 | Cheertek Inc. | Method for separating sentences in audio-video display system |
CN102650993A (en) * | 2011-02-25 | 2012-08-29 | 北大方正集团有限公司 | Index establishing and searching methods, devices and systems for audio-video file |
CN104078044A (en) * | 2014-07-02 | 2014-10-01 | 深圳市中兴移动通信有限公司 | Mobile terminal and sound recording search method and device of mobile terminal |
CN105045828A (en) * | 2015-06-26 | 2015-11-11 | 徐信 | Retrieval system and method for accurate positioning of audio/video speech information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111898643B (en) | Semantic matching method and device | |
CN107943877B (en) | Method and device for generating multimedia content to be played | |
US11295069B2 (en) | Speech to text enhanced media editing | |
US20230259712A1 (en) | Sound effect adding method and apparatus, storage medium, and electronic device | |
US9858330B2 (en) | Content categorization system | |
CN110650250B (en) | Method, system, device and storage medium for processing voice conversation | |
US20130304471A1 (en) | Contextual Voice Query Dilation | |
KR102391839B1 (en) | Method and device for processing user personal, server and storage medium | |
CN109829164B (en) | Method and device for generating text | |
CN104599692A (en) | Recording method and device and recording content searching method and device | |
US20200218760A1 (en) | Music search method and device, server and computer-readable storage medium | |
CN103942328A (en) | Video retrieval method and video device | |
CN107680584B (en) | Method and device for segmenting audio | |
CN110942765B (en) | Method, device, server and storage medium for constructing corpus | |
CN113468196B (en) | Method, apparatus, system, server and medium for processing data | |
CN108874815A (en) | The search method and device of audio-video | |
CN110019923A (en) | The lookup method and device of speech message | |
CN109213971A (en) | The generation method and device of court's trial notes | |
CN110019295B (en) | Database retrieval method, device, system and storage medium | |
CN108984572A (en) | Site information method for pushing and device | |
CN114490510A (en) | Text stream filing method and device, computer equipment and storage medium | |
JP6115487B2 (en) | Information collecting method, dialogue system, and information collecting apparatus | |
CN112148751B (en) | Method and device for querying data | |
US11269951B2 (en) | Indexing variable bit stream audio formats | |
CN109710844A (en) | The method and apparatus for quick and precisely positioning file based on search engine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing. Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing. Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20181123 |