CN100565532C - Multimedia resource search method based on audio content retrieval - Google Patents

Publication number
CN100565532C
Authority
CN
China
Prior art keywords
voice
index
keyword
meaning
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2008100620738A
Other languages
Chinese (zh)
Other versions
CN101281534A (en)
Inventor
叶睿智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Micro Network Co Ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2008-05-28
Filing date
2008-05-28
Publication date
2009-12-02
Application filed by Individual
Priority to CNB2008100620738A
Publication of CN101281534A
Application granted
Publication of CN100565532C
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a multimedia resource search method based on audio content retrieval. The method comprises the following steps: 1) a preprocessing server converts video and audio into standard speech to be recognized; 2) a speech recognition server trains an acoustic model from a training corpus and matches the speech to be recognized against the acoustic model to obtain a semantic text index; 3) an index server stores and organizes the keyword index and matches search conditions against it to obtain retrieval results. By applying keyword spotting to audio, the invention extracts the semantic information inherent in audio and video resources and indexes that information in text form. This provides a more comprehensive and reliable index of audio and video resources, allows a retrieval system to match multimedia resources more accurately, and locates the exact position at which a search term occurs in the audio or video.

Description

Multimedia resource search method based on audio content retrieval
Technical field
The present invention relates to a multimedia resource search method based on audio content retrieval, and in particular to a method for retrieving resources in video and audio form that finds the resources containing the queried information and gives the position of that information within each resource.
Background art
In the current digital and networked era, multimedia data has become a major part of the data carried on the Internet. Multimedia content such as audio, images, and video currently accounts for 15% of Internet content, and this figure is growing rapidly. High-capacity, high-speed storage systems provide the basis for mass storage of audio and video, and audio and video are used ever more widely across industries. How to obtain useful information from massive amounts of audio and video, that is, how to manage and retrieve audio and video resources, is becoming increasingly important, and audio and video have become one of the resource types that network users retrieve most often. Current mainstream search engines such as Google, Yahoo, and Baidu handle retrieval of text content on the Internet well, but for audio and video retrieval they still rely on matching the peripheral text associated with a multimedia resource (for example, the file name, tags, or descriptive text), so the search is in effect a text search. As a result, the content of the audio and video resources themselves is not well recognized, and useful resources that lack a clear textual description are ignored by search engines. Commonly used information retrieval systems, such as digital library systems and knowledge management systems, face the same problem: multimedia resources are becoming an increasingly important information carrier, yet effective retrieval methods are lacking. One way to address this problem is to use speech recognition to extract, from the speech portion of audio and video resources, the information that can be expressed as text, and then to index these resources by means of text retrieval.
Keyword recognition identifies given keywords in a continuous, unconstrained stream of natural speech. It covers two tasks: keyword detection and keyword verification. Keyword detection determines which of a set of pre-entered keywords a speech segment contains, and is a multi-class decision problem. Keyword verification answers whether or not the speech contains a given keyword, and is a binary decision problem. The keyword recognition technology of the present invention refers specifically to keyword detection.
Mel-frequency cepstral coefficients (MFCC) are built on Fourier analysis and cepstral analysis and reflect the frequency-domain characteristics of a sound signal. A Fourier transform is applied to the sample points in a short-time audio frame to obtain the energy of the frame at each frequency. The whole frequency band is divided into n sub-bands and the total energy in each of the n sub-bands is computed, giving the n Mel coefficients of the frame. The cepstral coefficients computed from these Mel coefficients are the Mel cepstral coefficients. Cepstral analysis is a nonlinear signal processing technique that forms the basis of homomorphic system theory; it was developed to process signals combined by convolution and was later applied to speech signals.
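The computation described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function names, the rectangular (non-overlapping) sub-bands spaced evenly on the Mel scale, the commonly used 2595·log10(1 + f/700) Mel formula, and the use of a log followed by a DCT-II as the cepstrum step are all assumptions that the text does not specify. Practical systems usually add windowing and overlapping triangular filters.

```python
import cmath
import math

def hz_to_mel(f):
    # Common Mel-scale formula (an assumption; the text does not give one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (O(N^2), acceptable for one short frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def mel_band_energies(frame, sample_rate, n_bands=12):
    """Total energy in each of n_bands sub-bands spaced evenly on the Mel scale."""
    n = len(frame)
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * b / n_bands) for b in range(n_bands + 1)]
    energies = [0.0] * n_bands
    for k, p in enumerate(power_spectrum(frame)):
        freq = k * sample_rate / n
        for b in range(n_bands):
            # Half-open bands; a bin exactly at the top edge is dropped.
            if edges[b] <= freq < edges[b + 1]:
                energies[b] += p
                break
    return energies

def mel_cepstrum(frame, sample_rate, n_bands=12, n_ceps=8):
    """Log sub-band energies followed by a DCT-II give the cepstral coefficients."""
    logs = [math.log(e + 1e-10)
            for e in mel_band_energies(frame, sample_rate, n_bands)]
    return [sum(v * math.cos(math.pi * c * (2 * b + 1) / (2 * n_bands))
                for b, v in enumerate(logs))
            for c in range(n_ceps)]
```

For example, a 128-sample frame of a 1000 Hz tone sampled at 8000 Hz puts almost all of its energy into the single sub-band that contains 1000 Hz, and `mel_cepstrum` returns eight coefficients for that frame.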
The hidden Markov model (HMM) is a probabilistic model based on transition probabilities and output probabilities. It treats speech as a stochastic process made up of a sequence of observable symbols, where the symbol sequence is the output of the state sequence of the sound-producing system. When hidden Markov models are used for recognition, a sound-production model is built for each speaker, and the state transition probability matrix and symbol output probability matrix are obtained by training. During recognition, the maximum probability of the unknown speech over the state transition process is computed, and the decision is made according to the model with the highest probability. Text-independent speaker recognition generally uses an ergodic HMM, while text-dependent speaker recognition generally uses a left-to-right HMM. An HMM does not require time normalization, which saves computation time and storage at decision time.
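The decision rule described above (compute the maximum path probability under each model, then choose the model with the highest value) can be illustrated with the Viterbi algorithm for a discrete-symbol HMM. The sketch below is hypothetical: the function names, the two toy left-to-right models, and their probabilities are invented for illustration, and a real recognizer would use continuous MFCC observations rather than discrete symbols.

```python
import math

def _log(p):
    # log(0) is treated as minus infinity so forbidden transitions never win.
    return math.log(p) if p > 0.0 else float("-inf")

def viterbi(obs, start, trans, emit):
    """Return (log probability of the best state path, the path itself)."""
    n = len(start)
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    backptrs = []
    for symbol in obs[1:]:
        prev, score, ptrs = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            ptrs.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emit[s][symbol]))
        backptrs.append(ptrs)
    last = max(range(n), key=lambda s: score[s])
    path = [last]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return score[last], path

def recognize(obs, models):
    """Pick the model name whose best path has the highest probability."""
    return max(models, key=lambda name: viterbi(obs, *models[name])[0])

# Two toy left-to-right models over the symbols {0, 1} (illustrative values only).
left_to_right = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "a": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),
    "b": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}
```

With the observation sequence `[0, 0, 1, 1]`, model "a" explains the data best, and its best state path stays in state 0 for two frames before moving to state 1.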
HMM-based keyword recognition is an important part of speech retrieval and plays a key role in retrieving specific content from speech. Because of the current limits on the robustness and practicality of speech recognition, a recognizer that uses continuous speech recognition with a large vocabulary and arbitrary keywords does not achieve the desired results and cannot adequately meet the requirements of speech retrieval applications. Keyword recognition, by contrast, is a relatively reliable technique with good prospects in speech retrieval.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a multimedia resource search method based on audio content retrieval.
The method comprises the following steps:
1) a preprocessing server converts video and audio into standard speech to be recognized;
2) a speech recognition server trains an acoustic model from a training corpus, and matches the speech to be recognized against the acoustic model to obtain a semantic text index;
3) an index server stores and organizes the keyword index, and matches search conditions to obtain retrieval results.
The preprocessing server converts video and audio into standard speech to be recognized as follows. The video and audio contain one or more segments of spoken speech. An audio separation technique extracts the audio data from the input video resource, keeping the separated audio consistent with the original video along the time axis. The audio data is then processed by digital noise reduction, in which portions with excessively low energy and speech segments containing noise are converted to silence. After conversion, the audio is output as standard speech to be recognized.
The speech recognition server trains the acoustic model from the training corpus as follows. The corpus consists of broadcast speech read aloud in standard Mandarin Chinese. A feature extraction module extracts speech features from the corpus, using Mel cepstral coefficients as the feature type, and acoustic model training on these features yields a hidden Markov acoustic model.
The speech to be recognized is matched against the acoustic model to obtain the semantic text index as follows. A feature extraction module extracts Mel cepstral coefficient features from the speech to be recognized. A path searcher reads these features and performs shortest-path recognition over the paths of the hidden Markov acoustic model to obtain the semantic text corresponding to each speech segment. Combined with the speech timeline information, the output is a semantic text index containing the semantic text and its time endpoint data.
The index server stores and organizes the keyword index and matches search conditions to obtain retrieval results as follows. The semantic text index output by the speech recognition server is converted into an inverted index, yielding inverted index entries whose key is the semantic keyword and whose value is the sequence of positions at which the keyword occurs; these entries are stored in the index database. At retrieval time, a keyword or combination of keywords in text form is input to the retrieval server, the inverted index entries are read according to the search keywords, and the sequence of keyword occurrence positions is output.
The semantic text index is a two-tuple containing the text keyword and the start and end points of the time period of one occurrence of the keyword in the speech. The keyword index is a three-tuple containing the keyword, the document number of the video or audio resource corresponding to the keyword, and the start and end points of the time period of one occurrence of the keyword in the speech. A retrieval result describes a series of resource files that contain the searched keyword and, for each resource file, a series of time periods of the speech segments in which the keyword occurs.
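The three data shapes described above can be sketched as plain Python structures. This is an illustrative assumption rather than the storage format of the invention: the type and field names are hypothetical, and time points are assumed to be expressed in seconds.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple

class SemanticEntry(NamedTuple):
    """Two-tuple: keyword plus the (start, end) of one occurrence."""
    keyword: str
    span: Tuple[float, float]   # seconds; the unit is an assumption

class KeywordEntry(NamedTuple):
    """Three-tuple: keyword, resource document number, (start, end)."""
    keyword: str
    doc_id: int
    span: Tuple[float, float]

def build_inverted_index(entries):
    """Map each keyword to the list of (doc_id, span) positions where it occurs."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.keyword].append((entry.doc_id, entry.span))
    return index

def search(index, keyword):
    """Retrieval result: resource files containing the keyword, each with its time periods."""
    result = defaultdict(list)
    for doc_id, span in index.get(keyword, []):
        result[doc_id].append(span)
    return dict(result)

entries = [
    KeywordEntry("history", 1, (3.0, 3.6)),
    KeywordEntry("culture", 1, (12.4, 13.0)),
    KeywordEntry("history", 2, (7.5, 8.1)),
    KeywordEntry("history", 1, (41.2, 41.9)),
]
index = build_inverted_index(entries)
hits = search(index, "history")
```

Here `hits` groups the occurrences of "history" by resource, so a player could seek directly to each listed time period within resource 1 or resource 2.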
By applying keyword detection to audio, the present invention extracts the semantic information inherent in audio and video resources and indexes that information in text form. This provides a more comprehensive and reliable index of audio and video resources, allows a retrieval system to match multimedia resources more accurately, and locates the exact position at which a search term occurs in the audio or video.
Description of drawings
Fig. 1 is an overall flow chart of the audio and video content retrieval system according to the present invention;
Fig. 2 is a flow chart of audio and video preprocessing according to the present invention;
Fig. 3 is a flow chart of keyword recognition according to the present invention;
Fig. 4 is a flow chart of index merging according to the present invention.
Embodiment
The multimedia resource search method based on audio content retrieval comprises the following steps:
1) The preprocessing server converts video and audio into standard speech to be recognized. As shown in Fig. 1, video data 1-1 and audio data 1-2 are input to preprocessing server S1, and preprocessing yields the standard corpus to be recognized 1-3.
2) The speech recognition server trains an acoustic model from a training corpus and matches the speech to be recognized against the acoustic model to obtain a semantic text index. As shown in Fig. 1, training corpus 1-4 is input to speech recognition server S2, and training yields an acoustic model that is stored in S2. The corpus to be recognized 1-3 and the acoustic model are then input together to speech recognition server S2, and matching yields the semantic text index information 1-5 for the corpus to be recognized 1-3.
3) The index server stores and organizes the keyword index and matches search conditions to obtain retrieval results. As shown in Fig. 1, the index information is input to index server S3 and merged into the inverted index database. At retrieval time, search condition 1-6 is input to index server S3; S3 matches the keywords in the search condition against the index database to obtain matching records, merges the matching records, and finally returns retrieval result 1-7.
The preprocessing server converts video and audio into standard speech to be recognized as follows. The video and audio contain one or more segments of spoken speech. An audio separation technique extracts the audio data from the input video resource, keeping the separated audio consistent with the original video along the time axis. The audio data is then processed by digital noise reduction, in which portions with excessively low energy and speech segments containing noise are converted to silence. After conversion, the audio is output as standard speech to be recognized. As shown in Fig. 2, input video data 2-1 first passes through speech data extraction module 2-2 to obtain the corresponding audio data. Both the directly input audio data 2-3 and the audio data extracted from video then pass through noise reduction module 2-4, which finally outputs the speech to be recognized 2-5.
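The step of converting low-energy portions to silence while preserving the time axis can be sketched as a simple frame-energy gate. This is only an assumed illustration: the function name, the frame length, and the threshold value are hypothetical, and the digital noise reduction actually used is not specified in the text.

```python
def gate_low_energy(samples, frame_len=160, threshold=1e-3):
    """Replace frames whose mean energy is below threshold with silence.

    The output has the same length as the input, so every sample keeps its
    original position on the time axis.
    """
    out = list(samples)
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy < threshold:
            out[start:start + len(frame)] = [0.0] * len(frame)
    return out

# A quiet frame, a loud frame, and a short quiet tail (illustrative values).
signal = [0.001] * 160 + [0.5] * 160 + [0.002] * 80
cleaned = gate_low_energy(signal)
```

Because silenced frames are zeroed rather than removed, the time endpoints computed later during recognition still refer to positions in the original video or audio.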
The speech recognition server trains the acoustic model from the training corpus as follows. The corpus consists of broadcast speech read aloud in standard Mandarin Chinese. A feature extraction module extracts speech features from the corpus, using Mel cepstral coefficients as the feature type, and acoustic model training on these features yields a hidden Markov acoustic model. As shown in Fig. 3, training corpus 3-1 passes through speech preprocessing module 3-2 and feature extraction module 3-3 to obtain the Mel cepstral coefficient feature data of the sample speech. This feature data is input to engine training and recognition module 3-4, which trains hidden Markov acoustic model 3-5.
The speech to be recognized is matched against the acoustic model to obtain the semantic text index as follows. A feature extraction module extracts Mel cepstral coefficient features from the speech to be recognized. A path searcher reads these features and performs shortest-path recognition over the paths of the hidden Markov acoustic model to obtain the semantic text corresponding to each speech segment. Combined with the speech timeline information, the output is a semantic text index containing the semantic text and its time endpoint data. As shown in Fig. 3, speech to be recognized 3-6 passes through speech preprocessing module 3-2 and feature extraction module 3-3 to obtain its Mel cepstral coefficient feature data. This feature data and hidden Markov acoustic model 3-5 are passed together through path search and matching module 3-7 to obtain the recognized semantic text index 3-8. Its form is a two-tuple <KW, Ref>, containing the text keyword KW and the data Ref giving the start and end points of the time period of one occurrence of the keyword in the speech.
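How the recognized text might be combined with the timeline to form <KW, Ref> two-tuples can be sketched as follows. The sketch assumes, purely for illustration, that the path search yields one label per frame (a keyword, or a filler value for non-keyword speech) and that frames are spaced by a fixed hop; the function name and parameters are hypothetical.

```python
from itertools import groupby

def frames_to_index(frame_labels, hop_seconds=0.01, filler=None):
    """Collapse runs of identical frame labels into (KW, (start, end)) two-tuples."""
    entries = []
    position = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label != filler:
            start = position * hop_seconds
            end = (position + length) * hop_seconds
            entries.append((label, (start, end)))
        position += length
    return entries

# Six frames at a 0.5 s hop (an unrealistically coarse hop, chosen for readability).
labels = [None, None, "music", "music", None, "news"]
semantic_index = frames_to_index(labels, hop_seconds=0.5)
```

Each resulting two-tuple carries the start and end point of one occurrence, which is the Ref value that the index server later merges into the inverted index.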
The index server stores and organizes the keyword index and matches search conditions to obtain retrieval results as follows. The semantic text index output by the speech recognition server is converted into an inverted index, yielding inverted index entries whose key is the semantic keyword and whose value is the sequence of positions at which the keyword occurs; these entries are stored in the index database. As shown in Fig. 4, the index keyword KW is first extracted from a single newly added index (of the form <KW, Ref>) output by the speech recognition module. The existing index database is searched by KW for the inverted index entry <KW, <Ref1, Ref2, ..., Refn>>, and the new index is merged with that entry to give a new entry <KW, <Ref1, Ref2, ..., Refn, Refn+1>>; the merge takes deduplication of index records into account. Finally, the new entry is written back to the index database. At retrieval time, a keyword or combination of keywords in text form is input to the retrieval server, the inverted index entries are read according to the search keywords, and the sequence of keyword occurrence positions is output. As shown in Fig. 4, the existing index database is searched by the search keyword KW for the inverted index entry <KW, <Ref1, Ref2, ..., Refn>>, which is returned as the retrieval result.
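The merge and lookup flow of Fig. 4 can be sketched with an in-memory dictionary standing in for the index database. This is a minimal sketch under stated assumptions: the function names are hypothetical, a Ref is represented as a (document number, start, end) tuple, and a real index server would use persistent storage.

```python
def merge_entry(index_db, keyword, ref):
    """Merge one new <KW, Ref> record into the inverted index entry for KW.

    The existing entry <KW, <Ref1..Refn>> is read, the new Ref is appended
    unless it is already present (deduplication), and the entry is written back.
    """
    refs = list(index_db.get(keyword, []))
    if ref not in refs:
        refs.append(ref)
    index_db[keyword] = refs
    return refs

def lookup(index_db, keyword):
    """Return the occurrence sequence <Ref1..Refn> for a search keyword."""
    return list(index_db.get(keyword, []))

index_db = {}
merge_entry(index_db, "history", (1, 3.0, 3.6))
merge_entry(index_db, "history", (2, 7.5, 8.1))
merge_entry(index_db, "history", (1, 3.0, 3.6))   # duplicate, ignored
```

After these three calls the entry for "history" holds two Refs, because the repeated record is dropped during the merge.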

Claims (4)

1. A multimedia resource search method based on audio content retrieval, characterized in that it comprises the following steps:
1) a preprocessing server converts video and audio into standard speech to be recognized;
2) a speech recognition server trains an acoustic model from a training corpus, and matches the speech to be recognized against the acoustic model to obtain a semantic text index;
3) an index server stores and organizes a keyword index, and matches search conditions to obtain retrieval results;
wherein the preprocessing server converts video and audio into standard speech to be recognized as follows: the video and audio contain one or more segments of spoken speech; an audio separation technique extracts the audio data from the input video resource, keeping the separated audio consistent with the original video along the time axis; the audio data is processed by digital noise reduction, in which portions with excessively low energy and speech segments containing noise are converted to silence; after conversion, the audio is output as standard speech to be recognized;
the speech recognition server trains the acoustic model from the training corpus as follows: the corpus consists of broadcast speech read aloud in standard Mandarin Chinese; a feature extraction module extracts speech features from the corpus, using Mel cepstral coefficients as the feature type; acoustic model training on the speech features yields a hidden Markov acoustic model;
the speech to be recognized is matched against the acoustic model to obtain the semantic text index as follows: a feature extraction module extracts Mel cepstral coefficient features from the speech to be recognized; a path searcher reads these features and performs shortest-path recognition over the paths of the hidden Markov acoustic model to obtain the semantic text corresponding to each speech segment; combined with the speech timeline information, a semantic text index containing the semantic text and its time endpoint data is output;
the index server stores and organizes the keyword index and matches search conditions to obtain retrieval results as follows: the semantic text index output by the speech recognition server is converted into an inverted index, yielding inverted index entries whose key is the semantic keyword and whose value is the sequence of positions at which the keyword occurs, and these entries are stored in the index database; at retrieval time, a keyword or combination of keywords in text form is input to the retrieval server, the inverted index entries are read according to the search keywords, and the sequence of keyword occurrence positions is output.
2. The multimedia resource search method based on audio content retrieval according to claim 1, characterized in that the semantic text index is a two-tuple containing the text keyword and the start and end points of the time period of one occurrence of the keyword in the speech.
3. The multimedia resource search method based on audio content retrieval according to claim 1, characterized in that the keyword index is a three-tuple containing the keyword, the document number of the video or audio resource corresponding to the keyword, and the start and end points of the time period of one occurrence of the keyword in the speech.
4. The multimedia resource search method based on audio content retrieval according to claim 1, characterized in that the retrieval result describes a series of resource files that contain the searched keyword and, for each resource file, a series of time periods of the speech segments in which the keyword occurs.
CNB2008100620738A 2008-05-28 2008-05-28 Multimedia resource search method based on audio content retrieval Expired - Fee Related CN100565532C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2008100620738A CN100565532C (en) 2008-05-28 2008-05-28 Multimedia resource search method based on audio content retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2008100620738A CN100565532C (en) 2008-05-28 2008-05-28 Multimedia resource search method based on audio content retrieval

Publications (2)

Publication Number Publication Date
CN101281534A CN101281534A (en) 2008-10-08
CN100565532C true CN100565532C (en) 2009-12-02

Family

ID=40014009

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2008100620738A Expired - Fee Related CN100565532C (en) 2008-05-28 2008-05-28 Multimedia resource search method based on audio content retrieval

Country Status (1)

Country Link
CN (1) CN100565532C (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101883079B (en) * 2009-05-08 2016-01-27 上海聚力传媒技术有限公司 For the method and apparatus of accelerating to request multimedia contents in the Internet
KR100999655B1 (en) * 2009-05-18 2010-12-13 윤재민 Digital video recorder system and application method thereof
CN101996195B (en) * 2009-08-28 2012-07-11 中国移动通信集团公司 Searching method and device of voice information in audio files and equipment
CN102073635B (en) 2009-10-30 2015-08-26 索尼株式会社 Program endpoint time detection apparatus and method and programme information searching system
CN102375834B (en) * 2010-08-17 2016-01-20 腾讯科技(深圳)有限公司 Audio file search method, system and audio file type recognition methods, system
CN102074235B (en) * 2010-12-20 2013-04-03 上海华勤通讯技术有限公司 Method of video speech recognition and search
CN102592628A (en) * 2012-02-15 2012-07-18 张群 Play control method of audio and video play file
CN102750366B (en) * 2012-06-18 2015-05-27 海信集团有限公司 Video search system and method based on natural interactive import and video search server
CN102831213B (en) * 2012-08-16 2015-08-05 广东小天才科技有限公司 A kind of searching method of learning content, device and electronic product
CN102833595A (en) * 2012-09-20 2012-12-19 北京十分科技有限公司 Method and apparatus for transferring information
CN104239328A (en) * 2013-06-18 2014-12-24 联想(北京)有限公司 Multimedia processing method and multimedia system
CN104572712A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 Multimedia file browsing system and multimedia file browsing method
CN104572716A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 System and method for playing video files
CN104618807B (en) * 2014-03-31 2017-11-17 腾讯科技(北京)有限公司 Multi-medium play method, apparatus and system
CN103914530B (en) * 2014-03-31 2017-02-15 北京中科模识科技有限公司 Method and system for monitoring rule-violating advertisements in broadcasting and TV programs
CN103956166A (en) * 2014-05-27 2014-07-30 华东理工大学 Multimedia courseware retrieval system based on voice keyword recognition
CN104105002B (en) * 2014-07-15 2018-12-21 百度在线网络技术(北京)有限公司 The methods of exhibiting and device of audio-video document
CN111757189B (en) * 2014-12-01 2022-07-15 构造数据有限责任公司 System and method for continuous media segment identification
CN104599692B (en) * 2014-12-16 2017-12-15 上海合合信息科技发展有限公司 The way of recording and device, recording substance searching method and device
CN105898204A (en) * 2014-12-25 2016-08-24 支录奎 Intelligent video recorder enabling video structuralization
CN104994400A (en) * 2015-07-06 2015-10-21 无锡天脉聚源传媒科技有限公司 Method and device for indexing video by means of acquisition of host name
CN105336343A (en) * 2015-10-28 2016-02-17 天脉聚源(北京)教育科技有限公司 Information searching method and device
CN105550308B (en) * 2015-12-14 2019-07-26 联想(北京)有限公司 A kind of information processing method, search method and electronic equipment
CN105898498A (en) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 Video synchronization method and system
CN105825849A (en) * 2016-04-06 2016-08-03 普强信息技术(北京)有限公司 Time position keyword hit analysis method based on identification result time boundary
CN105913838B (en) * 2016-05-19 2019-11-05 努比亚技术有限公司 Audio frequency controller device and method
CN106096050A (en) * 2016-06-29 2016-11-09 乐视控股(北京)有限公司 A kind of method and apparatus of video contents search
CN106686401A (en) * 2017-01-13 2017-05-17 山东鑫诚信电子科技有限公司 Video data distributed storage method, video data distributed storage device, video data retrieval method and video data retrieval device
CN107316638A (en) * 2017-06-28 2017-11-03 北京粉笔未来科技有限公司 A kind of poem recites evaluating method and system, a kind of terminal and storage medium
CN107609149B (en) * 2017-09-21 2020-06-19 北京奇艺世纪科技有限公司 Video positioning method and device
CN107818785A (en) * 2017-09-26 2018-03-20 平安普惠企业管理有限公司 A kind of method and terminal device that information is extracted from multimedia file
CN107798143A (en) * 2017-11-24 2018-03-13 珠海市魅族科技有限公司 A kind of information search method, device, terminal and readable storage medium storing program for executing
CN108986792B (en) * 2018-09-11 2021-02-12 苏州思必驰信息科技有限公司 Training and scheduling method and system for voice recognition model of voice conversation platform
CN109785052A (en) * 2018-12-26 2019-05-21 珠海横琴跨境说网络科技有限公司 Smart shopper method and system based on dark data mining
CN109740015A (en) * 2019-01-09 2019-05-10 安徽睿极智能科技有限公司 Magnanimity audio search method based on audio concentration abstract
CN109523990B (en) * 2019-01-21 2021-11-05 未来电视有限公司 Voice detection method and device
CN111723236A (en) * 2019-03-18 2020-09-29 百度在线网络技术(北京)有限公司 Video index establishing method, device, equipment and computer readable medium
CN110351183B (en) * 2019-06-03 2021-06-08 创新先进技术有限公司 Resource collection method and device in instant messaging
CN110232921A (en) * 2019-06-21 2019-09-13 深圳市酷开网络科技有限公司 Voice operating method, apparatus, smart television and system based on service for life
CN111125408B (en) * 2019-10-11 2023-08-29 平安科技(深圳)有限公司 Searching method, searching device, computer equipment and storage medium based on feature extraction
CN110867179A (en) * 2019-11-12 2020-03-06 云南电网有限责任公司德宏供电局 File storage and retrieval method and system based on voice recognition, IKAnalyzer word segmentation and hdfs
CN111429912B (en) * 2020-03-17 2023-02-10 厦门快商通科技股份有限公司 Keyword detection method, system, mobile terminal and storage medium
CN113470627A (en) * 2021-07-02 2021-10-01 因诺微科技(天津)有限公司 MVGG-CTC-based keyword search method
CN113744831A (en) * 2021-08-20 2021-12-03 中国联合网络通信有限公司成都市分公司 Online medical application purchasing system
CN114173191B (en) * 2021-12-09 2024-03-19 上海开放大学 Multi-language answering method and system based on artificial intelligence
CN115129923B (en) * 2022-05-17 2023-10-20 荣耀终端有限公司 Voice searching method, device and storage medium

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
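叶睿智. 韵河(LibSonar):基于音频内容检索的中华历史文化听书馆 [Ye Ruizhi. Yunhe (LibSonar): a Chinese history and culture audiobook library based on audio content retrieval]. 2007 *
一种面向基于内容视频检索的音频场景分割方法 [An audio scene segmentation method for content-based video retrieval]. 朱映映等 [Zhu Yingying et al.]. 小型微型计算机系统, Vol. 29, No. 3. 2008 *
基于音视特征的视频内容检测方法 [A video content detection method based on audio-visual features]. 蔡群等 [Cai Qun et al.]. 计算机工程, Vol. 33, No. 22. 2007 *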

Also Published As

Publication number Publication date
CN101281534A (en) 2008-10-08

Similar Documents

Publication Publication Date Title
CN100565532C (en) Multimedia resource search method based on audio content retrieval
Chelba et al. Retrieval and browsing of spoken content
KR101255405B1 (en) Indexing and searching speech with text meta-data
US7542966B2 (en) Method and system for retrieving documents with spoken queries
CN101510222B (en) Multilayer index voice document searching method
EP2252995B1 (en) Method and apparatus for voice searching for stored content using uniterm discovery
Favre et al. Robust named entity extraction from large spoken archives
CN104078044A (en) Mobile terminal and sound recording search method and device of mobile terminal
EP2135180A1 (en) Method and apparatus for distributed voice searching
KR20080069990A (en) Speech index pruning
Zhou et al. Towards spoken-document retrieval for the internet: Lattice indexing for large-scale web-search architectures
CN101593519A (en) Detect method and apparatus and the search method and the system of voice keyword
CN104199825A (en) Information inquiry method and system
CN114547373A (en) Method for intelligently identifying and searching programs based on audio
Wechsler et al. Speech retrieval based on automatic indexing
Alexander et al. Audio features, precomputed for podcast retrieval and information access experiments
Sen et al. Audio indexing
Wang Mandarin spoken document retrieval based on syllable lattice matching
Clements et al. Phonetic searching of digital audio
Charhad et al. Speaker identity indexing in audio-visual documents
Chang et al. Latent semantic retrieval of spoken documents over position specific posterior lattices
Lo et al. Multi-scale spoken document retrieval for Cantonese broadcast news
Sugimoto et al. Effect of document expansion using web documents for spoken documents retrieval
Feng Multilevel structured convolution neural network for speech keyword location and recognition: MSS‐Net
Nishizaki et al. Web page collection using automatic document segmentation for spoken document retrieval

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: HANGZHOU WISEZONE NETWORK CO., LTD.

Free format text: FORMER OWNER: YE RUIZHI

Effective date: 20101220

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 310013 EAST ROOM 326, SCIENCE AND TECHNOLOGY PARK A, ZHEJIANG UNIVERSITY, NO. 525, XIXI ROAD, XIHU DISTRICT, HANGZHOU CITY, ZHEJIANG PROVINCE TO: 310013 3/F, BUILDING 12, XIHU SHUYUAN SOFTWARE PARK, NO. 176, TIANMUSHAN ROAD, XIHU DISTRICT, HANGZHOU CITY, ZHEJIANG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20101220

Address after: 3/F, Building 12, Xihu Shuyuan Software Park, No. 176 Tianmushan Road, Xihu District, Hangzhou City, Zhejiang Province 310013

Patentee after: Hangzhou micro network Co., Ltd.

Address before: East Room 326, Science and Technology Park A, Zhejiang University, No. 525 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province 310013

Patentee before: Ye Ruizhi

DD01 Delivery of document by public notice

Addressee: Hangzhou micro network Co., Ltd.

Document name: Notification to Pay the Fees

DD01 Delivery of document by public notice

Addressee: Hangzhou micro network Co., Ltd.

Document name: Notification of Termination of Patent Right

C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091202

Termination date: 20130528