CN101079044A - Similarity measurement method for audio-frequency fragments - Google Patents

Similarity measurement method for audio-frequency fragments

Info

Publication number
CN101079044A
CN101079044A · CN200610080669A
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200610080669
Other languages
Chinese (zh)
Other versions
CN100585592C (en)
Inventor
彭宇新
房翠华
陈晓鸥
吴於茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Peking University
Peking University Founder Research and Development Center
Original Assignee
BEIDA FANGZHENG TECHN INST Co Ltd BEIJING
Peking University
Peking University Founder Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIDA FANGZHENG TECHN INST Co Ltd BEIJING, Peking University, Peking University Founder Group Co Ltd filed Critical BEIDA FANGZHENG TECHN INST Co Ltd BEIJING
Priority to CN200610080669A priority Critical patent/CN100585592C/en
Publication of CN101079044A publication Critical patent/CN101079044A/en
Application granted granted Critical
Publication of CN100585592C publication Critical patent/CN100585592C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for measuring the similarity between audio fragments. Because existing techniques represent an entire audio fragment with global audio features and do not consider differences in the fragment's specific content, they cannot effectively measure the similarity of audio content. To address this problem, the invention divides the task into two levels: the audio unit and the audio fragment. At the audio-unit level, an audio unit is defined as a series of audio frames of similar acoustic quality; each audio fragment is first segmented into a series of audio units, and the similarity between audio units is measured. At the audio-fragment level, based on the unit-level results, the similarity measurement of two audio fragments is modeled as a weighted bipartite graph; finally, the similarity of the two fragments is measured by optimal matching. The invention achieves higher retrieval accuracy, demonstrating the important role of audio retrieval technology in information retrieval.

Description

Method for measuring the similarity between audio fragments
Technical field
The invention belongs to the field of audio retrieval technology, and specifically relates to a method for measuring the similarity between audio fragments.
Background technology
With the steady growth of multimedia documents and applications, audio analysis and retrieval techniques are becoming increasingly important. Audio-fragment retrieval is an important form of such technology: given a query audio fragment, the task is to automatically retrieve similar audio fragments from an audio repository and rank them from high to low similarity. Existing audio retrieval techniques generally extract audio features from the audio fragments, use these features to measure similarity, and retrieve according to the measured results. Because such methods do not consider differences in the specific content of an audio fragment and represent the entire fragment with global audio features, they cannot effectively measure the similarity of audio content.
The paper "Dominant Feature Vectors Based Audio Similarity Measure" (J. Gu, L. Lu, R. Cai, H. J. Zhang and J. Yang, Pacific-Rim Conference on Multimedia, 2004, pp. 890-897) proposed an audio feature based on the eigenvectors and eigenvalues of an audio feature matrix: dominant feature vectors. The paper defines the frame features extracted from an audio fragment as a feature-frame matrix, computes the autocorrelation matrix of this matrix, and finally uses the eigenvectors and eigenvalues of the autocorrelation matrix as the fragment's feature. Because this method is based on statistics of the whole fragment, it cannot describe how the content changes within the fragment, which limits the accuracy of audio retrieval.
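The prior-art representation described above can be sketched as follows. This is an illustrative sketch only, not the cited paper's exact implementation; the frame-feature matrix layout and the number of retained eigenvectors k are assumptions.

```python
import numpy as np

def dominant_feature_vectors(frames, k=3):
    """Sketch of the cited prior art: summarize a whole clip by the top-k
    eigenvectors and eigenvalues of the autocorrelation matrix of its
    frame-feature matrix (shape: n_frames x dim)."""
    R = frames.T @ frames / len(frames)   # dim x dim autocorrelation matrix
    vals, vecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]    # keep the k dominant ones
    return vals[order], vecs[:, order]
```

Because this summary is computed over all frames at once, any temporal change of content inside the clip is averaged away, which is exactly the limitation the invention targets.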
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a method for measuring the similarity between audio fragments, used to measure the similarity between different audio fragments.
To achieve the above purpose, the technical solution adopted by the present invention is a method for measuring the similarity between audio fragments, comprising the following steps:
(1) dividing each audio fragment to be measured into a plurality of audio units of similar acoustic quality;
(2) calculating the similarity between any two audio units of the two audio fragments;
(3) measuring the similarity between the two audio fragments according to the results of step (2).
Further, the Bayesian Information Criterion (BIC) is used to divide each audio fragment to be measured into a plurality of audio units of similar acoustic quality.
Further, the following formulas are used to calculate the similarity of two audio units:

Sim(s_i, s_j) = exp(-Distance(s_i, s_j)/2)

Distance(s_i, s_j) = ( Σ_{p=1..n} (f_ip - f_jp)^2 )^{1/2}

where s_i and s_j denote two audio units, and Distance(s_i, s_j) denotes the Euclidean distance between the audio feature vectors of s_i and s_j.
Further, the feature vector of an audio unit is represented by the mean of the audio feature vectors of all frames in that unit.
Further, the feature vector of an audio frame is a 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients.
Further, the concrete steps of measuring the similarity between the two audio fragments are:
a: modeling the similarity measurement of the two audio fragments as a weighted bipartite graph;
b: measuring the similarity between the two audio fragments by optimal matching;
c: calculating the similarity between the two audio fragments with the following formula:

Sim_OM(X, Y) = Σω_ij / max(p, q)

where Σω_ij denotes the maximum total similarity obtained by the optimal matching of the two audio fragments, and p and q denote the numbers of audio units in the two audio fragments X and Y, respectively.
In addition, the present invention proposes an audio-fragment retrieval method that can more effectively retrieve audio fragments similar to a query fragment and rank them from high to low similarity, thereby more fully exploiting the great role of audio retrieval technology in information retrieval.
To achieve the above purpose, the technical solution adopted is an audio-fragment retrieval method for retrieving, from an audio repository, audio fragments similar to a query audio fragment, comprising the following steps:
(1) dividing the query audio fragment and the audio fragments in the repository into a plurality of audio units of similar acoustic quality;
(2) calculating the similarities between the audio units of the query fragment and those of each fragment in the repository;
(3) measuring the similarity between the query fragment and each fragment in the repository;
(4) retrieving the audio fragments similar to the query fragment, ranked from high to low similarity.
Further, the Bayesian Information Criterion (BIC) is used to divide the query audio fragment and the audio fragments in the repository into a plurality of audio units of similar acoustic quality.
Further, the following formulas are used to calculate the similarity of two audio units:

Sim(s_i, s_j) = exp(-Distance(s_i, s_j)/2)

Distance(s_i, s_j) = ( Σ_{p=1..n} (f_ip - f_jp)^2 )^{1/2}

where s_i and s_j denote two audio units, and Distance(s_i, s_j) denotes the Euclidean distance between the audio feature vectors of s_i and s_j; the feature vector of an audio unit is represented by the mean of the audio feature vectors of all frames in that unit, and the feature vector of an audio frame is a 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients.
Further, the concrete steps of measuring the similarity between the query fragment and a fragment in the repository are:
a: modeling the similarity measurement of the two audio fragments as a weighted bipartite graph;
b: measuring the similarity between the two audio fragments by optimal matching;
c: calculating the similarity between the two audio fragments with the following formula:

Sim_OM(X, Y) = Σω_ij / max(p, q)

where Σω_ij denotes the maximum total similarity obtained by the optimal matching of the two audio fragments, and p and q denote the numbers of audio units in the two audio fragments X and Y, respectively.
The effect of the present invention is that, compared with existing methods, it achieves higher retrieval accuracy, thereby giving full play to the great role of audio retrieval technology in information retrieval.
The reason the invention achieves this effect is as follows. Addressing the problems of the prior art, the invention divides audio-fragment retrieval into two levels: the audio unit and the audio fragment. At the audio-unit level, an audio unit is defined as a series of audio frames of similar acoustic quality; each fragment is first divided into audio units, and the similarity between the audio units of the two fragments is measured. At the audio-fragment level, based on the unit-level results, the similarity measurement of the two fragments is modeled as a weighted bipartite graph, and the similarity of the two fragments is finally measured by optimal matching.
Description of drawings
Fig. 1 is a flow diagram of the present invention;
Fig. 2 compares the recall of the present invention with that of 3 existing methods;
Fig. 3 compares the precision of the present invention with that of 3 existing methods.
Embodiment
The present invention is described in further detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, the method of the present invention comprises the following steps:
(1) dividing the query audio fragment and the audio fragments in the audio repository into audio units of similar acoustic quality;
The Bayesian Information Criterion (BIC) is first used to divide each audio fragment into audio units of similar acoustic quality. For a detailed description of the BIC, see "Efficient Audio Segmentation Algorithms based on the BIC" (M. Cettolo and M. Vescovi, IEEE International Conference on Acoustics, Speech and Signal Processing, 2003).
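The BIC boundary test referenced above can be sketched as follows. This is an illustrative ΔBIC test with full-covariance Gaussians; the feature matrix layout, the penalty weight lam, and the single-boundary setting are assumptions for the sketch, not the patent's exact segmentation procedure.

```python
import numpy as np

def delta_bic(X, i, lam=1.0):
    """Illustrative Delta-BIC test for a unit boundary at frame i of the
    frame-feature matrix X (shape: n x d). A positive value favours
    splitting the span into two acoustically different audio units."""
    n, d = X.shape
    def logdet(M):                        # log-determinant of a block's covariance
        _, ld = np.linalg.slogdet(np.cov(M.T) + 1e-8 * np.eye(d))
        return ld
    # model-complexity penalty: one extra mean vector + covariance matrix
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    return (0.5 * n * logdet(X)
            - 0.5 * i * logdet(X[:i])
            - 0.5 * (n - i) * logdet(X[i:])
            - penalty)
```

In practice the test is applied over a sliding window, and frames on either side of an accepted boundary form the acoustically homogeneous units the patent builds on.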
(2) calculating the similarities between the audio units of the query fragment and those of each fragment in the repository;
The feature vector of an audio frame is a 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients, and the feature vector of an audio unit is the mean of the audio feature vectors of all frames in that unit. The following formulas are then used to calculate the similarity of two audio units:

Sim(s_i, s_j) = exp(-Distance(s_i, s_j)/2)

Distance(s_i, s_j) = ( Σ_{p=1..n} (f_ip - f_jp)^2 )^{1/2}

where s_i and s_j denote two audio units, and Distance(s_i, s_j) denotes the Euclidean distance between the audio feature vectors of s_i and s_j.
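The unit-level measure above amounts to a few lines of code (a sketch assuming each unit is given as its matrix of per-frame 13-dimensional feature vectors):

```python
import numpy as np

def unit_similarity(si, sj):
    """Unit similarity per the formulas above: each unit's feature vector
    is the mean of its frames' feature vectors, and
    Sim = exp(-EuclideanDistance / 2)."""
    fi, fj = si.mean(axis=0), sj.mean(axis=0)   # unit vector = frame mean
    return np.exp(-np.linalg.norm(fi - fj) / 2.0)
```

The exponential maps the unbounded Euclidean distance into a similarity in (0, 1], with identical units scoring exactly 1.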
(3) measuring the similarity between the query fragment and each fragment in the repository;
a: modeling the similarity measurement of the two audio fragments as a weighted bipartite graph;
b: measuring the similarity between the two audio fragments by optimal matching;
c: calculating the similarity between the two audio fragments with the following formula:

Sim_OM(X, Y) = Σω_ij / max(p, q)

where Σω_ij denotes the maximum total similarity obtained by the optimal matching of the two audio fragments, and p and q denote the numbers of audio units in the two audio fragments X and Y, respectively.
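Steps a-c can be sketched as follows. For illustration the optimal (maximum-weight one-to-one) matching is found by brute force, which is only feasible for small unit counts; a real implementation would use the Kuhn-Munkres algorithm. The matrix sim is assumed to hold the unit-level similarities ω_ij.

```python
from itertools import permutations

import numpy as np

def clip_similarity(sim):
    """Model the unit similarities as a weighted bipartite graph (sim is
    p x q), take the maximum-weight one-to-one matching, and normalise
    the matched weight by max(p, q) as in Sim_OM."""
    p, q = sim.shape
    if p > q:                             # always match from the smaller side
        sim, p, q = sim.T, q, p
    best = max(sum(sim[i, perm[i]] for i in range(p))
               for perm in permutations(range(q), p))
    return best / q                       # q is now max(p, q)
```

The one-to-one constraint is what prevents a single unit from being counted against several units of the other fragment, which the patent credits for the validity of the fragment-level measure.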
(4) retrieving the audio fragments similar to the query fragment, ranked from high to low similarity.
The following experimental results show that, compared with existing methods, the present invention achieves higher retrieval accuracy, thereby giving full play to the great role of audio retrieval technology in information retrieval.
In this embodiment, a database of 1000 audio fragments was built, covering many types of sound, for example animal sounds, speech, vehicle sounds, machine sounds, music, gunshots, etc. Of these 1000 fragments, 500 have one or more similar fragments, while the other 500 occur only once. The 500 fragments that have one or more similar fragments are therefore used as query fragments, so as to verify the correctness of similar-fragment retrieval.
To demonstrate the validity of the present invention, the following 4 methods were tested and compared:
1. the present invention;
2. existing method 1: the paper "Dominant Feature Vectors Based Audio Similarity Measure" (J. Gu, L. Lu, R. Cai, H. J. Zhang and J. Yang, Pacific-Rim Conference on Multimedia, 2004, pp. 890-897);
3. existing method 2: L2 distance;
4. existing method 3: the paper "Content-based Indexing and Retrieval-by-Example in Audio" (Z. Liu and Q. Huang, IEEE International Conference on Multimedia and Expo, 2000).
In all 4 methods, the audio-frame feature is the 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients; the experimental results can therefore demonstrate the superiority of the present invention. The key differences between the 4 methods are shown in Table 1:
Table 1: the key differences between the present invention and the existing methods

                          Present invention         Existing method 1          Existing method 2      Existing method 3
Fragment representation   Audio-unit features       Dominant feature vectors   Audio-frame features   Audio-frame features
Similarity measurement    Unit and fragment level   Fragment level             Fragment level         Fragment level
Measure                   Optimal matching          Dominant feature vectors   K-L distance           L2 distance
The experiments adopt two evaluation indices from the MPEG-7 standardization activity: the Average Normalized Modified Retrieval Rank (ANMRR) and the Average Recall (AR). AR is similar to the traditional recall; ANMRR, compared with the traditional precision, reflects not only the proportion of correct retrieval results but also their rank positions. The smaller the ANMRR value, the higher the ranks of the correctly retrieved fragments; the larger the AR value, the larger the proportion of all similar fragments found among the first K retrieval results (K being the cutoff of the result list). Thus a larger AR indicates better recall of fragment retrieval, and a smaller ANMRR indicates higher retrieval accuracy. Table 2 compares the AR and ANMRR of the 4 methods over the 500 query fragments.
Table 2: experimental comparison between the present invention and the existing methods

        Present invention   Existing method 1   Existing method 2   Existing method 3
AR      0.72                0.66                0.67                0.66
ANMRR   0.26                0.33                0.32                0.33
As can be seen from Table 2, the present invention obtains better results than the existing methods on both AR and ANMRR. This is mainly because: (1) the invention builds the similarity of audio fragments on the similarity of audio units, and an audio unit is a series of audio frames of similar acoustic quality, which guarantees the validity of the fragment similarity measurement; (2) the invention measures fragment similarity by optimal matching, whose one-to-one matching mechanism guarantees the validity of the fragment-level measure.
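The two MPEG-7 indices used above can be computed roughly as below. This sketch follows the usual MPEG-7 definitions; the cutoff K and the 1.25K penalty rank for missed items are conventions assumed here, since the patent does not spell them out.

```python
def nmrr(retrieved, relevant, K):
    """Normalized Modified Retrieval Rank for one query (0 best, 1 worst);
    ANMRR is its mean over all queries.
    retrieved: ranked list of result ids; relevant: set of ground-truth ids."""
    ng = len(relevant)
    pos = {doc: r for r, doc in enumerate(retrieved[:K], start=1)}
    ranks = [pos.get(doc, 1.25 * K) for doc in relevant]  # missed -> penalty rank
    avr = sum(ranks) / ng                                 # average rank
    return (avr - 0.5 - ng / 2) / (1.25 * K - 0.5 - ng / 2)

def recall_at_k(retrieved, relevant, K):
    """Per-query recall within the first K results; AR averages this over queries."""
    return len(set(retrieved[:K]) & set(relevant)) / len(relevant)
```

With all relevant fragments ranked at the top, NMRR is 0; with all of them missed, it is 1, which matches the "smaller ANMRR is better" reading above.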
To further confirm the validity of the present invention, besides AR and ANMRR we adopted another pair of evaluation indices, recall and precision, defined as follows:
recall = number of relevant fragments retrieved / number of all relevant fragments
precision = number of relevant fragments retrieved / number of all fragments retrieved
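The two definitions above amount to the following (assuming retrieved and relevant are collections of fragment ids):

```python
def recall_precision(retrieved, relevant):
    """recall    = relevant fragments retrieved / all relevant fragments;
    precision = relevant fragments retrieved / all fragments retrieved."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant), hits / len(retrieved)
```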
As shown in Fig. 2 and Fig. 3, the present invention obtains better results than the existing methods on both recall and precision. Therefore, both groups of evaluation indices, AR and ANMRR as well as recall and precision, fully prove the outstanding effectiveness of the present invention in audio-fragment retrieval.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.
Note: this work was supported by the National Natural Science Foundation of China (grant No. 60503062).

Claims (10)

1. A method for measuring the similarity between audio fragments, characterized by comprising the following steps:
(1) dividing each audio fragment to be measured into a plurality of audio units of similar acoustic quality;
(2) calculating the similarity between any two audio units of the two audio fragments;
(3) measuring the similarity between the two audio fragments according to the results of step (2).
2. The method for measuring the similarity between audio fragments according to claim 1, characterized in that in step (1) the Bayesian Information Criterion (BIC) is used to divide each audio fragment to be measured into a plurality of audio units of similar acoustic quality.
3. The method for measuring the similarity between audio fragments according to claim 1, characterized in that in step (2) the following formulas are used to calculate the similarity of two audio units:

Sim(s_i, s_j) = exp(-Distance(s_i, s_j)/2)

Distance(s_i, s_j) = ( Σ_{p=1..n} (f_ip - f_jp)^2 )^{1/2}

where s_i and s_j denote two audio units, and Distance(s_i, s_j) denotes the Euclidean distance between the audio feature vectors of s_i and s_j.
4. The method for measuring the similarity between audio fragments according to claim 3, characterized in that in step (2) the feature vector of an audio unit is represented by the mean of the audio feature vectors of all frames in that unit.
5. The method for measuring the similarity between audio fragments according to claim 4, characterized in that in step (2) the feature vector of an audio frame is a 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients.
6. The method for measuring the similarity between audio fragments according to claim 1, 2, 3, 4 or 5, characterized in that step (3) is specifically:
a: modeling the similarity measurement of the two audio fragments as a weighted bipartite graph;
b: measuring the similarity between the two audio fragments by optimal matching;
c: calculating the similarity between the two audio fragments with the following formula:

Sim_OM(X, Y) = Σω_ij / max(p, q)

where Σω_ij denotes the maximum total similarity obtained by the optimal matching of the two audio fragments, and p and q denote the numbers of audio units in the two audio fragments X and Y, respectively.
7. An audio-fragment retrieval method for retrieving, from an audio repository, audio fragments similar to a query audio fragment, characterized by comprising the following steps:
(1) dividing the query audio fragment and the audio fragments in the repository into a plurality of audio units of similar acoustic quality;
(2) calculating the similarities between the audio units of the query fragment and those of each fragment in the repository;
(3) measuring the similarity between the query fragment and each fragment in the repository;
(4) retrieving the audio fragments similar to the query fragment, ranked from high to low similarity.
8. The audio-fragment retrieval method according to claim 7, characterized in that in step (1) the Bayesian Information Criterion (BIC) is used to divide the query audio fragment and the audio fragments in the repository into a plurality of audio units of similar acoustic quality.
9. The audio-fragment retrieval method according to claim 7, characterized in that in step (2) the following formulas are used to calculate the similarity of two audio units:

Sim(s_i, s_j) = exp(-Distance(s_i, s_j)/2)

Distance(s_i, s_j) = ( Σ_{p=1..n} (f_ip - f_jp)^2 )^{1/2}

where s_i and s_j denote two audio units, and Distance(s_i, s_j) denotes the Euclidean distance between the audio feature vectors of s_i and s_j; the feature vector of an audio unit is represented by the mean of the audio feature vectors of all frames in that unit; and the feature vector of an audio frame is a 13-dimensional vector composed of the log energy and the Mel-frequency cepstral coefficients.
10. The audio-fragment retrieval method according to claim 7, 8 or 9, characterized in that step (3) is specifically:
a: modeling the similarity measurement of the two audio fragments as a weighted bipartite graph;
b: measuring the similarity between the two audio fragments by optimal matching;
c: calculating the similarity between the two audio fragments with the following formula:

Sim_OM(X, Y) = Σω_ij / max(p, q)

where Σω_ij denotes the maximum total similarity obtained by the optimal matching of the two audio fragments, and p and q denote the numbers of audio units in the two audio fragments X and Y, respectively.
CN200610080669A 2006-05-25 2006-05-25 Similarity measurement method for audio-frequency fragments Expired - Fee Related CN100585592C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200610080669A CN100585592C (en) 2006-05-25 2006-05-25 Similarity measurement method for audio-frequency fragments

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200610080669A CN100585592C (en) 2006-05-25 2006-05-25 Similarity measurement method for audio-frequency fragments

Publications (2)

Publication Number Publication Date
CN101079044A true CN101079044A (en) 2007-11-28
CN100585592C CN100585592C (en) 2010-01-27

Family

ID=38906523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200610080669A Expired - Fee Related CN100585592C (en) 2006-05-25 2006-05-25 Similarity measurement method for audio-frequency fragments

Country Status (1)

Country Link
CN (1) CN100585592C (en)


Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101221760B (en) * 2008-01-30 2010-12-22 中国科学院计算技术研究所 Audio matching method and system
CN101593517B (en) * 2009-06-29 2011-08-17 北京市博汇科技有限公司 Audio comparison system and audio energy comparison method thereof
CN101980197B (en) * 2010-10-29 2012-10-31 北京邮电大学 Long time structure vocal print-based multi-layer filtering audio frequency search method and device
CN101980197A (en) * 2010-10-29 2011-02-23 北京邮电大学 Long time structure vocal print-based multi-layer filtering audio frequency search method and device
CN102469350A (en) * 2010-11-16 2012-05-23 北大方正集团有限公司 Method, device and system for advertisement statistics
CN102314875B (en) * 2011-08-01 2016-04-27 北京音之邦文化科技有限公司 Audio file identification method and device
CN102314875A (en) * 2011-08-01 2012-01-11 北京百度网讯科技有限公司 Audio file identification method and device
CN105355214A (en) * 2011-08-19 2016-02-24 杜比实验室特许公司 Method and equipment for measuring similarity
US9460736B2 (en) 2011-08-19 2016-10-04 Dolby Laboratories Licensing Corporation Measuring content coherence and measuring similarity
CN102956237B (en) * 2011-08-19 2016-12-07 杜比实验室特许公司 The method and apparatus measuring content consistency
CN102956237A (en) * 2011-08-19 2013-03-06 杜比实验室特许公司 Method and device for measuring content consistency and method and device for measuring similarity
CN104184741A (en) * 2014-09-05 2014-12-03 重庆市汇链信息科技有限公司 Method for distributing massive audio and video data into distribution server
CN104992713B (en) * 2015-05-14 2018-11-13 电子科技大学 A kind of quick broadcast audio comparison method
CN104992713A (en) * 2015-05-14 2015-10-21 电子科技大学 Fast audio comparing method
WO2017113973A1 (en) * 2015-12-29 2017-07-06 北京搜狗科技发展有限公司 Method and device for audio identification
CN107609149A (en) * 2017-09-21 2018-01-19 北京奇艺世纪科技有限公司 A kind of video locating method and device
CN107609149B (en) * 2017-09-21 2020-06-19 北京奇艺世纪科技有限公司 Video positioning method and device
CN108091346A (en) * 2017-12-15 2018-05-29 奕响(大连)科技有限公司 A kind of similar determination methods of the audio of Local Fourier Transform
CN108039178A (en) * 2017-12-15 2018-05-15 奕响(大连)科技有限公司 A kind of audio similar determination methods of Fourier transformation time-domain and frequency-domain
CN111400543A (en) * 2020-03-20 2020-07-10 腾讯科技(深圳)有限公司 Audio segment matching method, device, equipment and storage medium
CN111400543B (en) * 2020-03-20 2023-10-10 腾讯科技(深圳)有限公司 Audio fragment matching method, device, equipment and storage medium
CN116884437A (en) * 2023-09-07 2023-10-13 北京惠朗时代科技有限公司 Speech recognition processor based on artificial intelligence
CN116884437B (en) * 2023-09-07 2023-11-17 北京惠朗时代科技有限公司 Speech recognition processor based on artificial intelligence

Also Published As

Publication number Publication date
CN100585592C (en) 2010-01-27


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220915

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: Peking University

Patentee after: PEKING University FOUNDER R & D CENTER

Address before: 100871, fangzheng building, 298 Fu Cheng Road, Beijing, Haidian District

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: Peking University

Patentee before: PEKING University FOUNDER R & D CENTER

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100127

CF01 Termination of patent right due to non-payment of annual fee