WO2013028351A3 - Measuring content coherence and measuring similarity of audio sections - Google Patents

Measuring content coherence and measuring similarity of audio sections

Publication number
WO2013028351A3
Authority
WO
WIPO (PCT)
Prior art keywords
audio
measuring
section
similarity
content
Prior art date
Application number
PCT/US2012/049876
Other languages
French (fr)
Other versions
WO2013028351A2 (en)
Inventor
Lie Lu
Mingqing HU
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2011-08-19
Filing date
2012-08-07
Publication date
2013-05-10
Application filed by Dolby Laboratories Licensing Corporation
Priority to US14/237,395 (US9218821B2)
Priority to EP12753860.1A (EP2745294A2)
Priority to JP2014526069A (JP5770376B2)
Publication of WO2013028351A2
Publication of WO2013028351A3
Priority to US14/952,820 (US9460736B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

Embodiments for measuring content coherence and embodiments for measuring content similarity are described. Content coherence between a first audio section and a second audio section is measured. For each audio segment in the first audio section, a predetermined number of audio segments in the second audio section are determined, such that the content similarity between the audio segment in the first audio section and each of the determined audio segments is higher than that between the audio segment and all the other audio segments in the second audio section. An average of the content similarity between the audio segment in the first audio section and the determined audio segments is calculated. The content coherence is calculated as the average, the maximum, or the minimum of the averages calculated for the audio segments in the first audio section. The content similarity may be calculated based on a Dirichlet distribution.
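
The coherence calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: each audio segment is assumed to be represented by a plain feature vector, cosine similarity stands in for the content similarity measure (the abstract only says the similarity may be based on a Dirichlet distribution and gives no formula), and the names `content_coherence`, `cosine_similarity`, `k`, and `aggregate` are hypothetical.

```python
import math
from statistics import mean


def cosine_similarity(a, b):
    """Stand-in similarity between two feature vectors (an assumption,
    not the Dirichlet-based measure mentioned in the abstract)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def content_coherence(first, second, k, similarity=cosine_similarity,
                      aggregate="mean"):
    """Coherence between two audio sections, each a list of segments.

    For every segment in `first`, the k most similar segments in `second`
    are selected and their similarities are averaged. The per-segment
    averages are then reduced with the mean, the maximum, or the minimum.
    """
    if not first:
        raise ValueError("first section must contain at least one segment")
    if k < 1 or k > len(second):
        raise ValueError("k must be between 1 and len(second)")
    reducers = {"mean": mean, "max": max, "min": min}
    if aggregate not in reducers:
        raise ValueError("aggregate must be 'mean', 'max', or 'min'")

    per_segment_averages = []
    for segment in first:
        # Similarity of this segment to every segment in the second section,
        # sorted so that the k highest values come first.
        sims = sorted((similarity(segment, other) for other in second),
                      reverse=True)
        per_segment_averages.append(mean(sims[:k]))
    return reducers[aggregate](per_segment_averages)


# Toy example with two-dimensional feature vectors.
first_section = [[1.0, 0.0], [0.0, 1.0]]
second_section = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coherence_mean = content_coherence(first_section, second_section, k=2)
coherence_min = content_coherence(first_section, second_section, k=2,
                                  aggregate="min")
```

In the toy example the first segment has two exact matches in the second section and the second segment has only one, so the minimum-based coherence is lower than the mean-based one. Any other similarity function with the same two-argument signature can be passed through the `similarity` parameter.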

Priority Applications (4)

Application Number Priority Date Filing Date Title
US14/237,395 US9218821B2 (en) 2011-08-19 2012-08-07 Measuring content coherence and measuring similarity
EP12753860.1A EP2745294A2 (en) 2011-08-19 2012-08-07 Measuring content coherence and measuring similarity of audio sections
JP2014526069A JP5770376B2 (en) 2011-08-19 2012-08-07 Content coherence measurement and similarity measurement
US14/952,820 US9460736B2 (en) 2011-08-19 2015-11-25 Measuring content coherence and measuring similarity

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201110243107.5 2011-08-19
CN201110243107.5A CN102956237B (en) 2011-08-19 2011-08-19 The method and apparatus measuring content consistency
US201161540352P 2011-09-28 2011-09-28
US61/540,352 2011-09-28

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US14/237,395 A-371-Of-International US9218821B2 (en) 2011-08-19 2012-08-07 Measuring content coherence and measuring similarity
US14/952,820 Division US9460736B2 (en) 2011-08-19 2015-11-25 Measuring content coherence and measuring similarity

Publications (2)

Publication Number Publication Date
WO2013028351A2 (en) 2013-02-28
WO2013028351A3 (en) 2013-05-10

Family

ID=47747027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2012/049876 WO2013028351A2 (en) 2011-08-19 2012-08-07 Measuring content coherence and measuring similarity

Country Status (5)

Country Link
US (2) US9218821B2 (en)
EP (1) EP2745294A2 (en)
JP (2) JP5770376B2 (en)
CN (2) CN102956237B (en)
WO (1) WO2013028351A2 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103337248B (en) * 2013-05-17 2015-07-29 南京航空航天大学 A kind of airport noise event recognition based on time series kernel clustering
CN103354092B (en) * 2013-06-27 2016-01-20 天津大学 A kind of audio frequency music score comparison method with error detection function
US9424345B1 (en) 2013-09-25 2016-08-23 Google Inc. Contextual content distribution
TWI527025B (en) * 2013-11-11 2016-03-21 財團法人資訊工業策進會 Computer system, audio matching method, and computer-readable recording medium thereof
CN104683933A (en) 2013-11-29 2015-06-03 杜比实验室特许公司 Audio object extraction method
CN103824561B (en) * 2014-02-18 2015-03-11 北京邮电大学 Missing value nonlinear estimating method of speech linear predictive coding model
CN104882145B (en) 2014-02-28 2019-10-29 杜比实验室特许公司 It is clustered using the audio object of the time change of audio object
CN105335595A (en) 2014-06-30 2016-02-17 杜比实验室特许公司 Feeling-based multimedia processing
CN104332166B (en) * 2014-10-21 2017-06-20 福建歌航电子信息科技有限公司 Can fast verification recording substance accuracy, the method for synchronism
CN104464754A (en) * 2014-12-11 2015-03-25 北京中细软移动互联科技有限公司 Sound brand search method
CN104900239B (en) * 2015-05-14 2018-08-21 电子科技大学 A kind of audio real-time comparison method based on Walsh-Hadamard transform
US10535371B2 (en) * 2016-09-13 2020-01-14 Intel Corporation Speaker segmentation and clustering for video summarization
CN110491413B (en) * 2019-08-21 2022-01-04 中国传媒大学 Twin network-based audio content consistency monitoring method and system
CN111445922B (en) * 2020-03-20 2023-10-03 腾讯科技(深圳)有限公司 Audio matching method, device, computer equipment and storage medium
CN111785296B (en) * 2020-05-26 2022-06-10 浙江大学 Music segmentation boundary identification method based on repeated melody
CN112185418B (en) * 2020-11-12 2022-05-17 度小满科技(北京)有限公司 Audio processing method and device
CN112885377A (en) * 2021-02-26 2021-06-01 平安普惠企业管理有限公司 Voice quality evaluation method and device, computer equipment and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
EP1073272A1 (en) * 1999-02-15 2001-01-31 Sony Corporation Signal processing method and video/audio processing device
US6542869B1 (en) * 2000-05-11 2003-04-01 Fuji Xerox Co., Ltd. Method for automatic analysis of audio including music and speech
US20060065106A1 (en) * 2004-09-28 2006-03-30 Pinxteren Markus V Apparatus and method for changing a segmentation of an audio piece
US20080288255A1 (en) * 2007-05-16 2008-11-20 Lawrence Carin System and method for quantifying, representing, and identifying similarities in data streams

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
CN1159704C (en) * 1994-06-13 2004-07-28 松下电器产业株式会社 Signal analyzer
WO2002021879A2 (en) * 2000-09-08 2002-03-14 Harman International Industries, Inc. Digital system to compensate power compression of loudspeakers
CN1168031C (en) * 2001-09-07 2004-09-22 联想(北京)有限公司 Content filter based on text content characteristic similarity and theme correlation degree comparison
JP4125990B2 (en) * 2003-05-01 2008-07-30 日本電信電話株式会社 Search result use type similar music search device, search result use type similar music search processing method, search result use type similar music search program, and recording medium for the program
US8214304B2 (en) * 2005-10-17 2012-07-03 Koninklijke Philips Electronics N.V. Method and device for calculating a similarity metric between a first feature vector and a second feature vector
CN100585592C (en) * 2006-05-25 2010-01-27 北大方正集团有限公司 Similarity measurement method for audio-frequency fragments
CN101563938B (en) * 2006-12-21 2014-05-07 皇家飞利浦电子股份有限公司 A device for and a method of processing audio data
US7979252B2 (en) * 2007-06-21 2011-07-12 Microsoft Corporation Selective sampling of user state based on expected utility
US8842851B2 (en) * 2008-12-12 2014-09-23 Broadcom Corporation Audio source localization system and method
CN101593517B (en) * 2009-06-29 2011-08-17 北京市博汇科技有限公司 Audio comparison system and audio energy comparison method thereof
US8190663B2 (en) * 2009-07-06 2012-05-29 Osterreichisches Forschungsinstitut Fur Artificial Intelligence Der Osterreichischen Studiengesellschaft Fur Kybernetik Of Freyung Method and a system for identifying similar audio tracks
JP4937393B2 (en) * 2010-09-17 2012-05-23 株式会社東芝 Sound quality correction apparatus and sound correction method
US8885842B2 (en) * 2010-12-14 2014-11-11 The Nielsen Company (Us), Llc Methods and apparatus to determine locations of audience members
JP5691804B2 (en) * 2011-04-28 2015-04-01 富士通株式会社 Microphone array device and sound signal processing program

Non-Patent Citations (3)

Title
MATTHEW HOFFMAN ET AL: "Content-based musical similarity computation using the hierarchical Dirichlet Process", ISMIR 2008: PROCEEDINGS OF THE 9TH INT. CONF. ON MUSIC INFORMATION RETRIEVAL, 18 September 2008 (2008-09-18), XP055048191, ISBN: 978-0-61-524849-3, Retrieved from the Internet <URL:http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.139.5356&rep=rep1&type=pdf> [retrieved on 20121218] *
RAUBER ET AL: "Probabilistic distance measures of the Dirichlet and Beta distributions", PATTERN RECOGNITION, ELSEVIER, GB, vol. 41, no. 2, 5 October 2007 (2007-10-05), pages 637 - 645, XP022287768, ISSN: 0031-3203, DOI: 10.1016/J.PATCOG.2007.06.023 *
RON J WEISS ET AL: "Unsupervised Discovery of Temporal Structure in Music", IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, IEEE, US, vol. 5, no. 6, 21 April 2011 (2011-04-21), pages 1240 - 1251, XP011386714, ISSN: 1932-4553, DOI: 10.1109/JSTSP.2011.2145356 *

Also Published As

Publication number Publication date
JP2015232710A (en) 2015-12-24
JP5770376B2 (en) 2015-08-26
US9218821B2 (en) 2015-12-22
CN105355214A (en) 2016-02-24
EP2745294A2 (en) 2014-06-25
JP6113228B2 (en) 2017-04-12
CN102956237A (en) 2013-03-06
CN102956237B (en) 2016-12-07
WO2013028351A2 (en) 2013-02-28
US9460736B2 (en) 2016-10-04
JP2014528093A (en) 2014-10-23
US20160078882A1 (en) 2016-03-17
US20140205103A1 (en) 2014-07-24

Similar Documents

Publication Publication Date Title
WO2013028351A3 (en) Measuring content coherence and measuring similarity of audio sections
WO2012148520A3 (en) Tolerance evaluation with reduced measured points
WO2014059137A3 (en) Autonomic network sentinels
MX339321B (en) Absorbent members having density profile.
EP3032108B8 (en) Centrifugal compressor and turbocharger
MX2011009648A (en) Network status detection.
WO2011156799A3 (en) Detecting state estimation network model data errors
WO2014025619A3 (en) Method and apparatus for optimized representation of variables in neural systems
AP2014007969A0 (en) Face calibration method and system, and computer storage medium
IN2015MN01766A (en)
WO2014118642A3 (en) Methods for analysing the commodity market for electricity
WO2014143969A3 (en) Methods and apparatus to credit usage of mobile devices
HK1218214A1 (en) Methods and systems for dynamic spectrum arbitrage user profile management
WO2014078668A3 (en) Evaluating electronic network devices in view of cost and service level considerations
BR112014000106A8 (en) METHOD AND DEVICE FOR DETECTING SEIZURES
GB201513849D0 (en) Storage management calculator, and storage management method
AU332745S (en) Earphone
WO2012160527A3 (en) Integrity evaluation system in an implantable hearing prosthesis
EP2974429A4 (en) Methods and systems for dynamic spectrum arbitrage
WO2014115115A3 (en) Determining apnea-hypopnia index ahi from speech
GB201005011D0 (en) Over-speed, rough loads and hard landing detection system
EP2974428B8 (en) Methods and systems for dynamic spectrum arbitrage
EP2587447A3 (en) Protecting intellectual property rights across namespaces
EP3051143A4 (en) Centrifugal compressor and supercharger
EP2622313A4 (en) System and method of extending the linear dynamic range of event counting

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 12753860

Country of ref document: EP

Kind code of ref document: A2

WWE Wipo information: entry into national phase

Ref document number: 14237395

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2012753860

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2014526069

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE