CN108874815A - The search method and device of audio-video - Google Patents

Info

Publication number
CN108874815A
Authority
CN
China
Prior art keywords
video
audio
sentence
retrieval
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710328019.2A
Other languages
Chinese (zh)
Inventor
王晓涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201710328019.2A
Publication of CN108874815A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses an audio-video retrieval method and device. The method includes: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; matching the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returning the audio-videos corresponding to the successfully matched sentences; and determining the target audio-video among the corresponding audio-videos. The application solves the problem in the related art of low accuracy in audio-video retrieval.

Description

Audio-video retrieval method and device
Technical field
This application relates to the field of audio and video processing, and in particular to an audio-video retrieval method and device.
Background
Audio and video files, especially files such as telephone recordings, interview recordings, and interview videos, often need to be retrieved. Typically, a full-text index is built over the names of the audio and video files, and retrieval is then performed on those names. However, this approach can only search audio and video titles; it knows nothing about the content of the audio or video, so the files it finds may not be the ones wanted. In the related art, a brief introduction is added to each audio or video file, and full-text retrieval is then performed on both the title and the introduction. Although the corresponding audio or video file can be found from its introduction, compiling these introductions requires substantial manual effort, and if an introduction does not correspond to the actual content of the audio-video, retrieval accuracy drops considerably.
No effective solution has yet been proposed for the problem of low accuracy in audio-video retrieval in the related art.
Summary of the invention
The main purpose of this application is to provide an audio-video retrieval method and device, so as to solve the problem of low accuracy in audio-video retrieval in the related art.
To achieve the above purpose, according to one aspect of this application, an audio-video retrieval method is provided. The method includes: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; matching the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returning the audio-videos corresponding to the successfully matched sentences; and determining the target audio-video among the corresponding audio-videos.
Further, before the retrieval sentence is matched against the multiple sentences pointed to by the index, the method also includes: converting each audio-video in the audio-video set into corresponding text; splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and determining the start position and end position of each sentence in its corresponding audio-video.
Further, after the target audio-video is determined among the corresponding audio-videos, the method also includes: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, where the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, where, after receiving these positions, the audio-video player jumps to the start position in the target audio-video and begins playback.
Further, determining the start position and end position of each sentence in its corresponding audio-video includes: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the sentences corresponding to each audio-video, one by one and in order, against the text information of the segments of the current audio-video to determine the start position and end position of each sentence in the current audio-video.
Further, before the retrieval sentence is matched against the multiple sentences pointed to by the index, the method also includes: creating an index for the sentences corresponding to each audio-video based on target information, where the target information includes the sentence content, the start position and end position of the sentence in the corresponding audio-video, and the title of the corresponding audio-video.
To achieve the above purpose, according to another aspect of this application, an audio-video retrieval device is provided. The device includes: a first acquisition unit, configured to obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; a matching unit, configured to match the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; a second acquisition unit, configured to return the audio-videos corresponding to the successfully matched sentences; and a first determination unit, configured to determine the target audio-video among the corresponding audio-videos.
Further, the device includes: a conversion unit, configured to convert each audio-video in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences pointed to by the index; a splitting unit, configured to split the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and a second determination unit, configured to determine the start position and end position of each sentence in its corresponding audio-video.
Further, the device also includes: a third determination unit, configured to determine, after the target audio-video is determined among the corresponding audio-videos and according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, where the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and a sending unit, configured to send the start position and end position of the target sentence in the target audio-video to an audio-video player, where, after receiving these positions, the audio-video player jumps to the start position in the target audio-video and begins playback.
To achieve the above purpose, according to another aspect of this application, a storage medium is provided. The storage medium includes a stored program, where the program executes the audio-video retrieval method described in any of the above.
To achieve the above purpose, according to another aspect of this application, a processor is provided. The processor is configured to run a program, where the program, when run, executes the audio-video retrieval method described in any of the above.
This application uses the following steps: obtaining a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; matching the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returning the audio-videos corresponding to the successfully matched sentences; and determining the target audio-video among the corresponding audio-videos. This solves the problem of low accuracy in audio-video retrieval in the related art. By matching the retrieval sentence against the multiple sentences pointed to by the index and then determining the target audio-video among the audio-videos corresponding to the successfully matched sentences, the accuracy of audio-video retrieval is improved.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the application. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the audio-video retrieval method provided according to an embodiment of this application; and
Fig. 2 is a schematic diagram of the audio-video retrieval device provided according to an embodiment of this application.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application, without creative effort, shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and so on in the description, claims, and drawings of this application are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the application described herein can be implemented. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of this application, an audio-video retrieval method is provided.
Fig. 1 is a flowchart of the audio-video retrieval method according to an embodiment of this application. As shown in Fig. 1, the method includes the following steps:
Step S101: obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video.
To find the target audio-video the user is looking for among multiple audio and video files such as telephone recordings, interview recordings, and interview videos, a retrieval sentence entered by the user is received, and the target audio-video is retrieved based on it. Alternatively, speech input by the user is received and converted to text by speech recognition, and the text serves as the retrieval sentence.
Step S102: match the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set.
The retrieval sentence is matched against the multiple sentences pointed to by the index; any sentence that the retrieval sentence matches is a successfully matched sentence. In this application, the multiple sentences are the sentences corresponding to each audio-video in the audio-video set. In the audio-video retrieval method provided by the embodiments of this application, before the retrieval sentence is matched against the multiple sentences pointed to by the index, the method also includes: converting each audio-video in the audio-video set into corresponding text; splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and determining the start position and end position of each sentence in its corresponding audio-video.
It should be noted that the index mentioned in this application refers to an index in a search engine (for example, an inverted index). The search engine used in this application may be ElasticSearch, Solr, or the like.
Specifically, in this application, all the audio and video files within the user's retrieval scope serve as the audio-video set, and each audio-video in the set is converted into corresponding text. For example, telephone recording 1 in the set is converted into its corresponding text, telephone recording 2 is converted into its corresponding text, and so on.
For example, the converted text is split at specific punctuation marks (such as periods, question marks, and semicolons), which corresponds to the first preset condition above, yielding the multiple sentences corresponding to each audio-video. After splitting, these sentences can be stored in a data table, in which each entry is one sentence.
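The punctuation-based split described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `split_sentences` and the exact set of punctuation marks in `SENTENCE_END` are assumptions chosen for the example.

```python
import re

# Assumed "first preset condition": split at these sentence-ending marks.
# Both ASCII and full-width (Chinese) punctuation are included.
SENTENCE_END = "。？！；.?!;"


def split_sentences(text: str) -> list[str]:
    """Split a transcript into sentences, keeping the closing punctuation."""
    chars = re.escape(SENTENCE_END)
    # One or more non-terminator characters, then an optional terminator.
    pattern = f"[^{chars}]+[{chars}]?"
    return [m.strip() for m in re.findall(pattern, text) if m.strip()]
```

Each returned string would become one entry of the data table. A production splitter would need extra care with decimal points and abbreviations, which this sketch treats as sentence boundaries.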
Optionally, in the audio-video retrieval method provided by the embodiments of this application, determining the start position and end position of each sentence in its corresponding audio-video includes: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the sentences corresponding to each audio-video, one by one and in order, against the text information of the segments of the current audio-video to determine the start position and end position of each sentence in the current audio-video.
For example, after the sentences obtained by splitting have been stored in the data table, the first sentence in the table, item1, is fetched and its start position is marked as 0. A segment s1 of length L, for example L = 5 seconds, is cut from the audio-video beginning at that start position, so s1 starts at 0 and ends at 5 s. The segment is then converted into text t1, and t1 is matched against item1. If item1 contains t1 (that is, t1 is a substring of item1) and t1 matches item1 exactly, then the start position of item1 is 0 and its end position is 5, and the next sentence, item2, can be processed.
If t1 is only the first few words of item1, for example t1 = "Today I", then a second audio segment s2 of length L is cut starting from the end position of s1 and converted into text t2. If t2 is the ending string of item1, the end position of item1 is the end position of s2, namely 10 s. Once this match is complete, the next sentence, item2, can be matched.
If only part of t2 belongs to item1, for example t2 = "work to do I need", then t2 spans two sentences. In that case the length of segment s2 is reduced and a sub-segment is matched instead. For example, a sub-segment of length 2 s is cut from s2; if its converted text is exactly "work to do", the end position of item1 is 5 + 2 = 7 s. When the next sentence, item2, is then matched, audio segments are cut starting from 7 s.
If t2 does not reach the end of item1, a further segment s3 is cut, and so on iteratively. After one sentence in the data table has been processed, the next sentence is processed, until all sentences are done. The resulting data table is shown in Table 1 below:
Table 1
Sentence | Start position (s) | End position (s)
Today I have a lot of work to do. | 0 | 6
I need to work overtime until very late and cannot leave work on time. | 6 | 15
Tomorrow is the weekend, so I can rest for two days. | 15 | 24
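The segment-by-segment alignment that produces a table like Table 1 can be sketched as below. This is only an illustration under stated assumptions: `transcribe(start, end)` is a hypothetical callable standing in for whatever speech recognizer converts a clip to text, the parameter names `clip_len` and `step` are invented for the example, and recognized text is assumed to match the sentence text character for character (no normalization of punctuation or spacing is attempted).

```python
from typing import Callable


def align_sentences(
    sentences: list[str],
    transcribe: Callable[[float, float], str],  # hypothetical recognizer
    duration: float,
    clip_len: float = 5.0,  # segment length L, in seconds
    step: float = 1.0,      # amount by which an overlong clip is shortened
) -> list[tuple[str, float, float]]:
    """Return (sentence, start, end) for each sentence, in order."""
    results = []
    start = 0.0
    for sentence in sentences:
        end = start
        covered = ""  # text recognized so far for this sentence
        while covered != sentence and end < duration:
            clip_end = min(end + clip_len, duration)
            text = transcribe(end, clip_end)
            # Shorten the clip while its text spills into the next sentence.
            while (not sentence.startswith(covered + text)
                   and clip_end - step > end):
                clip_end -= step
                text = transcribe(end, clip_end)
            if not sentence.startswith(covered + text):
                break  # cannot align further; keep the best estimate
            covered += text
            end = clip_end
        results.append((sentence, start, end))
        start = end  # the next sentence starts where this one ended
    return results
```

The outer loop corresponds to cutting s1, s2, s3, and so on; the inner loop corresponds to reducing the length of a segment that spans two sentences. A real recognizer rarely reproduces the full-file transcript exactly, so a practical version would likely need fuzzy matching instead of `startswith`.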
Optionally, in the audio-video retrieval method provided by the embodiments of this application, before the retrieval sentence is matched against the multiple sentences pointed to by the index, the method also includes: creating an index for the sentences corresponding to each audio-video based on target information, where the target information includes the sentence content, the start position and end position of the sentence in the corresponding audio-video, and the title of the corresponding audio-video.
After the data table containing the sentences and each sentence's start and end positions is obtained, an index is created for the sentences of each audio-video based on the target information: the sentence content, the sentence's start and end positions in the corresponding audio-video, and the title of the corresponding audio-video. Once the index has been created for the data table, the retrieval sentence is matched against the sentences pointed to by the index; after a successful match, the content of the matched sentence, its start and end positions in the corresponding audio-video, the title of the corresponding audio-video, and so on can be obtained.
It should be noted that the target information may also include information such as the storage address of the audio-video. The specific items of target information given in this application are examples, and the target information is not limited to them.
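To make the indexing step concrete, here is a small in-memory inverted index over sentence records. It is a sketch only: the application names ElasticSearch or Solr as possible engines, whereas this example swaps in a plain Python dictionary, and the names `SentenceRecord`, `tokenize`, `build_index`, and `match` are hypothetical. The whitespace-style tokenizer is also an assumption; Chinese text would need a proper word segmenter.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SentenceRecord:
    """One sentence plus the target information stored with it."""
    content: str
    start: float  # start position in the audio-video, in seconds
    end: float    # end position in the audio-video, in seconds
    title: str    # title of the corresponding audio-video


def tokenize(text: str) -> list[str]:
    # Naive lowercase word tokenizer (assumption for this sketch).
    return re.findall(r"\w+", text.lower())


def build_index(records: list[SentenceRecord]) -> dict[str, set[int]]:
    """Map each token to the positions of the records that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for pos, record in enumerate(records):
        for token in tokenize(record.content):
            index[token].add(pos)
    return index


def match(query: str, records: list[SentenceRecord],
          index: dict[str, set[int]]) -> list[SentenceRecord]:
    """Return the records whose content contains every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return [records[i] for i in sorted(hits)]
```

Because each record carries its title and positions, a successful match immediately yields everything the later steps need.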
Step S103: return the audio-videos corresponding to the sentences that successfully match the retrieval sentence.
Step S104: determine the target audio-video among the corresponding audio-videos.
For example, the audio-videos returned as corresponding to the successfully matched sentences are: telephone recording 1, interview recording 1, interview video 3, interview video 5, and so on. The target audio-video that the user is looking for is determined among these corresponding audio-videos.
By matching the retrieval sentence against the multiple sentences pointed to by the index and then determining the target audio-video among the audio-videos corresponding to the successfully matched sentences, the accuracy of audio-video retrieval is improved.
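Steps S103 and S104 can be illustrated by grouping matched sentences by the title of their audio-video, so that the user (or a later selection step) can pick the target. This is a hedged sketch: the function name `candidates_by_title`, the tuple layout of a hit, and the choice to list titles with more matching sentences first are all assumptions; the application does not specify a ranking.

```python
def candidates_by_title(hits):
    """Group matched sentences by audio-video title.

    hits: iterable of (title, sentence, start, end) tuples, one per
    sentence that matched the retrieval sentence.
    Returns a list of (title, [(sentence, start, end), ...]) pairs.
    """
    grouped = {}
    for title, sentence, start, end in hits:
        grouped.setdefault(title, []).append((sentence, start, end))
    # Assumed ordering: more matching sentences first; sorted() is stable,
    # so ties keep the order in which titles were first seen.
    return sorted(grouped.items(), key=lambda item: -len(item[1]))
```

The returned list plays the role of the "corresponding audio-videos" from which the target audio-video is then determined.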
Optionally, in the audio-video retrieval method provided by the embodiments of this application, after the target audio-video is determined among the corresponding audio-videos, the method also includes: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of the target sentence in the target audio-video, where the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, where, after receiving these positions, the audio-video player jumps to the start position in the target audio-video and begins playback.
For example, the target audio-video is interview video 3, and the sentence in interview video 3 that successfully matches the retrieval sentence is "I need to work overtime until very late and cannot leave work on time." The start position of this sentence in interview video 3 is determined to be 6 and its end position 15, and these two positions are sent to the audio-video player. The audio-video player jumps to start position 6 in interview video 3 and begins playback.
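The hand-off to the player can be sketched as a small message exchange. Everything here is hypothetical: the application does not define a message format or a player interface, so the JSON layout, `build_seek_message`, and the `Player` stub are invented purely to show the idea of sending the positions and jumping to the start.

```python
import json


def build_seek_message(title: str, start: float, end: float) -> str:
    """Serialize the target sentence's positions for the player (assumed format)."""
    if not 0 <= start <= end:
        raise ValueError("positions must satisfy 0 <= start <= end")
    return json.dumps({"title": title, "start": start, "end": end})


class Player:
    """Stub audio-video player that only tracks what it would play."""

    def __init__(self) -> None:
        self.title = None
        self.position = None

    def handle(self, message: str) -> None:
        data = json.loads(message)
        self.title = data["title"]
        # Jump to the start position of the matched sentence.
        self.position = data["start"]


player = Player()
player.handle(build_seek_message("interview video 3", 6, 15))
```

After the last line, the stub player's position is the start of the matched sentence, mirroring the example in which playback of interview video 3 begins at position 6.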
With the above solution, the user inputs a piece of speech or a piece of text, the corresponding audio-videos can be found among multiple audio-video files, the target audio-video is determined from them, and the audio-video player can jump to the position in the target audio-video where the retrieval sentence begins and play from there. This improves the efficiency of audio-video retrieval, as well as its accuracy and the user experience.
In summary, the audio-video retrieval method provided by the embodiments of this application obtains a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; matches the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returns the audio-videos corresponding to the successfully matched sentences; and determines the target audio-video among the corresponding audio-videos. This solves the problem of low accuracy in audio-video retrieval in the related art. By matching the retrieval sentence against the multiple sentences pointed to by the index and then determining the target audio-video among the audio-videos corresponding to the successfully matched sentences, the accuracy of audio-video retrieval is improved.
It should be noted that the steps shown in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order.
The embodiments of this application also provide an audio-video retrieval device. It should be noted that the audio-video retrieval device of the embodiments can be used to execute the audio-video retrieval method provided by the embodiments. The device is introduced below.
Fig. 2 is a schematic diagram of the audio-video retrieval device according to an embodiment of this application. As shown in Fig. 2, the device includes: a first acquisition unit 10, a matching unit 20, a second acquisition unit 30, and a first determination unit 40.
Specifically, the first acquisition unit 10 is configured to obtain a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video.
The matching unit 20 is configured to match the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set.
The second acquisition unit 30 is configured to return the audio-videos corresponding to the sentences that successfully match the retrieval sentence.
The first determination unit 40 is configured to determine the target audio-video among the corresponding audio-videos.
In the audio-video retrieval device provided by the embodiments of this application, the first acquisition unit 10 obtains a retrieval sentence, where the retrieval sentence is used to retrieve a target audio-video; the matching unit 20 matches the retrieval sentence against multiple sentences pointed to by an index to obtain the sentences that successfully match the retrieval sentence, where the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; the second acquisition unit 30 returns the audio-videos corresponding to the successfully matched sentences; and the first determination unit 40 determines the target audio-video among the corresponding audio-videos. This solves the problem of low accuracy in audio-video retrieval in the related art. By matching the retrieval sentence against the multiple sentences pointed to by the index and then determining the target audio-video among the audio-videos corresponding to the successfully matched sentences, the accuracy of audio-video retrieval is improved.
Optionally, in the audio-video retrieval device provided by the embodiments of this application, the device includes: a conversion unit, configured to convert each audio-video in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences pointed to by the index; a splitting unit, configured to split the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and a second determination unit, configured to determine the start position and end position of each sentence in its corresponding audio-video.
Optionally, in the audio-video retrieval device provided by the embodiments of this application, the device also includes: a third determination unit, configured to determine, after the target audio-video is determined among the corresponding audio-videos and according to the start position and end position of each sentence in the target audio-video, the start position and end position of the target sentence in the target audio-video, where the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and a sending unit, configured to send the start position and end position of the target sentence in the target audio-video to an audio-video player, where, after receiving these positions, the audio-video player jumps to the start position in the target audio-video and begins playback.
The audio-video retrieval device includes a processor and a memory. The first acquisition unit 10, matching unit 20, second acquisition unit 30, first determination unit 40, and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which fetches the corresponding program unit from the memory. One or more kernels may be provided, and audio-video retrieval is performed by adjusting kernel parameters.
The memory may take forms such as volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored; when the program is executed by a processor, the audio-video retrieval method is implemented.
An embodiment of the present invention provides a processor configured to run a program, where the program, when run, executes the audio-video retrieval method.
The embodiment of the invention provides a kind of equipment, equipment include processor, memory and storage on a memory and can The program run on a processor, processor realize following steps when executing program:Obtain retrieval sentence, wherein the retrieval Sentence is for retrieving target audio-video;The retrieval sentence is matched with multiple sentences that index is directed toward, is obtained With the successful sentence of the retrieval statement matching, wherein the multiple sentence is that each audio-video in audio-video set is corresponding Multiple sentences;Return to audio-video corresponding with the retrieval successful sentence of statement matching;And it is regarded in the corresponding sound The target audio-video is determined in frequency.
Before the retrieval sentence is matched with multiple sentences that index is directed toward, the method also includes:Respectively Each audio-video in the audio-video set is converted into corresponding text;According to the first preset condition to each audio-video pair The text answered is split, and the corresponding multiple sentences of each audio-video are obtained;Determine each sentence in corresponding audio-video Initial position and end position.
After the target audio-video is determined among the corresponding audio-videos, the method further comprises: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
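The hand-off to the player might look like the sketch below. The player interface is not described in the text, so `StubPlayer` and its `play_range` method are hypothetical stand-ins, positions are assumed to be in seconds, and the target sentence is found by exact text comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentenceSpan:
    text: str
    start_s: float  # start position in the audio-video, in seconds (assumed unit)
    end_s: float    # end position in the audio-video, in seconds (assumed unit)


class StubPlayer:
    """Stand-in for an audio-video player; it only records the requested range."""

    def __init__(self):
        self.position_s = 0.0
        self.stop_at_s = None
        self.playing = False

    def play_range(self, start_s, end_s):
        # Jump to the start position and begin playing.
        self.position_s = start_s
        self.stop_at_s = end_s
        self.playing = True


def jump_to_sentence(player, spans, target_text):
    """Send the target sentence's positions to the player; False if not found."""
    for span in spans:
        if span.text == target_text:
            player.play_range(span.start_s, span.end_s)
            return True
    return False


# Hypothetical sample data.
spans = [
    SentenceSpan("good morning everyone.", 0.0, 10.0),
    SentenceSpan("today we talk about search.", 5.0, 15.0),
]
player = StubPlayer()
found = jump_to_sentence(player, spans, "today we talk about search.")
```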
Determining the start position and end position of each sentence in its corresponding audio-video comprises: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the multiple sentences corresponding to each audio-video, one by one in sequence, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
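One way to picture the segment-based positioning is the sketch below. It assumes fixed-length segments measured in seconds, assumes the segment transcripts concatenate to the same characters as the sentences once whitespace is removed, and reports positions at segment granularity. The function name `align_sentences` and these simplifications are illustrative assumptions, not details taken from the text.

```python
from bisect import bisect_right


def align_sentences(sentences, segment_texts, segment_len_s):
    """Map each sentence, in order, to (start_s, end_s) at segment granularity."""

    def squash(s):
        # Ignore whitespace so sentences can straddle segment boundaries.
        return "".join(s.split())

    offsets = []  # character offset at which each segment starts
    parts = []
    total = 0
    for text in segment_texts:
        offsets.append(total)
        part = squash(text)
        parts.append(part)
        total += len(part)
    joined = "".join(parts)

    def segment_of(char_index):
        return bisect_right(offsets, char_index) - 1

    spans = []
    cursor = 0  # sentences are matched one by one, moving forward only
    for sentence in sentences:
        needle = squash(sentence)
        found = joined.find(needle, cursor) if needle else -1
        if found < 0:
            spans.append(None)  # sentence could not be located
            continue
        first_seg = segment_of(found)
        last_seg = segment_of(found + len(needle) - 1)
        spans.append((first_seg * segment_len_s, (last_seg + 1) * segment_len_s))
        cursor = found + len(needle)
    return spans


# Hypothetical sample: three 5-second segments and two sentences.
segments = ["good morning every", "one. today we talk", "about search."]
sentences = ["good morning everyone.", "today we talk about search."]
spans = align_sentences(sentences, segments, 5)
```

Shorter segments give finer positions at the cost of more speech-to-text conversions.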
Before the retrieval sentence is matched against the multiple sentences to which the index points, the method further comprises: creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video. The device herein may be a server, a PC, a tablet (PAD), a mobile phone, or the like.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: obtaining a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video; matching the retrieval sentence against the multiple sentences to which an index points, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set; returning the audio-videos corresponding to the sentences that successfully match the retrieval sentence; and determining the target audio-video among the corresponding audio-videos.
Before the retrieval sentence is matched against the multiple sentences to which the index points, the method further comprises: converting each audio-video in the audio-video set into corresponding text; splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and determining the start position and end position of each sentence in its corresponding audio-video.
After the target audio-video is determined among the corresponding audio-videos, the method further comprises: determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
Determining the start position and end position of each sentence in its corresponding audio-video comprises: dividing each audio-video into audio-video segments of a preset length according to a second preset condition; converting each audio-video segment into corresponding text information; and matching the multiple sentences corresponding to each audio-video, one by one in sequence, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
Before the retrieval sentence is matched against the multiple sentences to which the index points, the method further comprises: creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video.
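The index-creation step can be sketched as a small inverted index whose entries carry the four pieces of target information (sentence content, start position, end position, audio-video title). The index structure is not specified in the text, so the word-level inverted index, the whitespace tokenization (which would not suit unsegmented Chinese text), the AND semantics in `lookup`, and all names below are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    content: str    # sentence content
    start_s: float  # start position in the audio-video (seconds, assumed unit)
    end_s: float    # end position in the audio-video (seconds, assumed unit)
    title: str      # title of the corresponding audio-video


def build_index(entries):
    """Map each lowercase word to the entries whose sentence contains it."""
    index = defaultdict(list)
    for entry in entries:
        for word in set(entry.content.lower().split()):
            index[word].append(entry)
    return dict(index)


def lookup(index, query):
    """Return entries containing every word of the query (assumed AND semantics)."""
    words = query.lower().split()
    if not words:
        return []
    result = list(index.get(words[0], []))
    for word in words[1:]:
        allowed = index.get(word, [])
        result = [e for e in result if e in allowed]
    return result


# Hypothetical sample data.
e1 = IndexEntry("today we talk about search", 5.0, 15.0, "lecture.mp4")
e2 = IndexEntry("search results are ranked", 40.0, 45.0, "demo.mp4")
index = build_index([e1, e2])
```

Because each entry stores the title and both positions, a successful lookup yields everything needed to return the audio-video and jump to the matching sentence.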
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit the present application. Various modifications and changes to the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (10)

1. An audio-video retrieval method, characterized by comprising:
obtaining a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video;
matching the retrieval sentence against the multiple sentences to which an index points, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set;
returning the audio-videos corresponding to the sentences that successfully match the retrieval sentence; and
determining the target audio-video among the corresponding audio-videos.
2. The method according to claim 1, characterized in that, before the retrieval sentence is matched against the multiple sentences to which the index points, the method further comprises:
converting each audio-video in the audio-video set into corresponding text;
splitting the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and
determining the start position and end position of each sentence in its corresponding audio-video.
3. The method according to claim 2, characterized in that, after the target audio-video is determined among the corresponding audio-videos, the method further comprises:
determining, according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and
sending the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
4. The method according to claim 2, characterized in that determining the start position and end position of each sentence in its corresponding audio-video comprises:
dividing each audio-video into audio-video segments of a preset length according to a second preset condition;
converting each audio-video segment into corresponding text information; and
matching the multiple sentences corresponding to each audio-video, one by one in sequence, against the text information corresponding to the audio-video segments of the current audio-video, to determine the start position and end position of each sentence in the current audio-video.
5. The method according to claim 2, characterized in that, before the retrieval sentence is matched against the multiple sentences to which the index points, the method further comprises:
creating an index over the multiple sentences corresponding to each audio-video based on target information, wherein the target information includes the sentence content, the start position and end position of the sentence in its corresponding audio-video, and the title of the corresponding audio-video.
6. An audio-video retrieval device, characterized by comprising:
a first acquisition unit, configured to obtain a retrieval sentence, wherein the retrieval sentence is used to retrieve a target audio-video;
a matching unit, configured to match the retrieval sentence against the multiple sentences to which an index points, to obtain the sentences that successfully match the retrieval sentence, wherein the multiple sentences are the sentences corresponding to each audio-video in an audio-video set;
a second acquisition unit, configured to return the audio-videos corresponding to the sentences that successfully match the retrieval sentence; and
a first determination unit, configured to determine the target audio-video among the corresponding audio-videos.
7. The device according to claim 6, characterized in that the device further comprises:
a conversion unit, configured to convert each audio-video in the audio-video set into corresponding text before the retrieval sentence is matched against the multiple sentences to which the index points;
a splitting unit, configured to split the text corresponding to each audio-video according to a first preset condition to obtain the multiple sentences corresponding to each audio-video; and
a second determination unit, configured to determine the start position and end position of each sentence in its corresponding audio-video.
8. The device according to claim 7, characterized in that the device further comprises:
a third determination unit, configured to determine, after the target audio-video is determined among the corresponding audio-videos and according to the start position and end position of each sentence in the target audio-video, the start position and end position of a target sentence in the target audio-video, wherein the target sentence is the sentence in the target audio-video that successfully matches the retrieval sentence; and
a sending unit, configured to send the start position and end position of the target sentence in the target audio-video to an audio-video player, wherein, after receiving the start position and end position, the audio-video player jumps to the start position in the target audio-video and starts playing.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein the program executes the audio-video retrieval method according to any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the audio-video retrieval method according to any one of claims 1 to 5.
CN201710328019.2A 2017-05-10 2017-05-10 The search method and device of audio-video Pending CN108874815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710328019.2A CN108874815A (en) 2017-05-10 2017-05-10 The search method and device of audio-video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710328019.2A CN108874815A (en) 2017-05-10 2017-05-10 The search method and device of audio-video

Publications (1)

Publication Number Publication Date
CN108874815A true CN108874815A (en) 2018-11-23

Family

ID=64319325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710328019.2A Pending CN108874815A (en) 2017-05-10 2017-05-10 The search method and device of audio-video

Country Status (1)

Country Link
CN (1) CN108874815A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110275979A (en) * 2019-07-01 2019-09-24 成都启英泰伦科技有限公司 A kind of mapping management process of voice data and text data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070087312A1 (en) * 2005-10-18 2007-04-19 Cheertek Inc. Method for separating sentences in audio-video display system
CN102650993A (en) * 2011-02-25 2012-08-29 北大方正集团有限公司 Index establishing and searching methods, devices and systems for audio-video file
CN104078044A (en) * 2014-07-02 2014-10-01 深圳市中兴移动通信有限公司 Mobile terminal and sound recording search method and device of mobile terminal
CN105045828A (en) * 2015-06-26 2015-11-11 徐信 Retrieval system and method for accurate positioning of audio/video speech information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20181123
