WO2003001508A8 - Method for identifying audio sequences - Google Patents

Method for identifying audio sequences

Info

Publication number
WO2003001508A8
Authority
WO
WIPO (PCT)
Prior art keywords
abstract
phase
generated during
audio
representing
Prior art date
Application number
PCT/ES2002/000312
Other languages
English (en)
French (fr)
Other versions
WO2003001508B1 (es)
WO2003001508A1 (es)
Inventor
I Mont Eloi Batlle
Original Assignee
Uni Pompeu Fabra
I Mont Eloi Batlle
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uni Pompeu Fabra, I Mont Eloi Batlle
Priority to EP02743274A (EP1439523A1, en)
Publication of WO2003001508A1 (es)
Publication of WO2003001508B1 (es)
Publication of WO2003001508A8 (es)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26 Pre-filtering or post-filtering
    • G10L19/265 Pre-filtering, e.g. high frequency emphasis prior to encoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/005 Algorithms for electrophonic musical instruments or musical processing, e.g. for automatic composition or resource allocation
    • G10H2250/015 Markov chains, e.g. hidden Markov models [HMM], for musical processing, e.g. musical analysis or musical composition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/005 Algorithms for electrophonic musical instruments or musical processing, e.g. for automatic composition or resource allocation
    • G10H2250/015 Markov chains, e.g. hidden Markov models [HMM], for musical processing, e.g. musical analysis or musical composition
    • G10H2250/021 Dynamic programming, e.g. Viterbi, for finding the most likely or most desirable sequence in music analysis, processing or composition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The method comprises the following steps:
1. preprocessing (602) of the audio sequence, comprising the steps of removing frequencies above a predetermined value with a low-pass filter and digitizing the signal in an analog-to-digital converter;
2. parameter extraction (301), representative of the audio sequence, to obtain a parameter vector Ot specially adapted to the proposed identification approach;
3. computation of abstract descriptors (302), representative of the parameter vector Ot, implemented as Hidden Markov Models and optimized using an abstract-descriptor definition database (303) generated during the prior execution of a first phase of the method in learning mode;
4. identification (605) of the audio sequences thus processed in a database of abstract-descriptor sequences (505) generated during the prior execution of a second phase of the method in learning mode;
5. recording of the results (607) obtained in the identification step (605).
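
Steps 3 and 4 of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the Hidden Markov Models are decoded or how sequences are compared, so the Viterbi decoding, the scalar Gaussian emissions with a shared variance, the edit-distance matching, and every name and number below (`viterbi`, `collapse`, `identify`, `track_a`, the means and probabilities) are assumptions chosen only to make the idea concrete.

```python
import math

def viterbi(observations, means, log_trans, log_init, var=1.0):
    """Most likely HMM state sequence for scalar observations.

    Assumption: state s emits a Gaussian with mean means[s] and variance var.
    """
    n_states = len(means)

    def log_emit(s, o):
        return -0.5 * ((o - means[s]) ** 2 / var + math.log(2 * math.pi * var))

    # delta[s]: best log-probability of any path that ends in state s
    delta = [log_init[s] + log_emit(s, observations[0]) for s in range(n_states)]
    backptr = []
    for o in observations[1:]:
        prev, delta, step = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            step.append(best)
            delta.append(prev[best] + log_trans[best][s] + log_emit(s, o))
        backptr.append(step)
    # Backtrack from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for step in reversed(backptr):
        state = step[state]
        path.append(state)
    path.reverse()
    return path

def collapse(path):
    """Merge consecutive repeats so each descriptor appears once per run."""
    return [s for i, s in enumerate(path) if i == 0 or s != path[i - 1]]

def edit_distance(a, b):
    """Levenshtein distance between two descriptor sequences."""
    row = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev_diag, row[0] = row[0], i
        for j, y in enumerate(b, 1):
            prev_diag, row[j] = row[j], min(row[j] + 1, row[j - 1] + 1,
                                            prev_diag + (x != y))
    return row[len(b)]

def identify(descriptors, database):
    """Return the database key whose stored sequence is closest to the query."""
    return min(database, key=lambda k: edit_distance(descriptors, database[k]))

# Hypothetical toy data: three descriptors with sticky transitions.
means = [0.0, 1.0, 2.0]
stay, move = math.log(0.8), math.log(0.1)
log_trans = [[stay if p == s else move for s in range(3)] for p in range(3)]
log_init = [math.log(1 / 3)] * 3
observations = [0.1, -0.2, 0.9, 1.1, 2.2, 1.9]  # stand-in for the vectors Ot

path = viterbi(observations, means, log_trans, log_init, var=0.1)
descriptors = collapse(path)
database = {"track_a": [0, 1, 2], "track_b": [2, 1, 0]}  # hypothetical entries
print(identify(descriptors, database))
```

In this sketch the database of descriptor sequences plays the role of item (505) and the model parameters play the role of item (303); in the method described above both would come from the two learning-mode phases rather than being written by hand.
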
PCT/ES2002/000312 2001-06-25 2002-06-25 Method for identifying audio sequences WO2003001508A1 (es)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP02743274A EP1439523A1 (en) 2001-06-25 2002-06-25 Method for multiple access and transmission in a point-to-multipoint system on an electric network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ES200101468A ES2190342B1 (es) 2001-06-25 2001-06-25 Method for identifying audio sequences.
ESP0101468 2001-06-25

Publications (3)

Publication Number Publication Date
WO2003001508A1 (es) 2003-01-03
WO2003001508B1 (es) 2004-07-08
WO2003001508A8 (es) 2004-08-12

Family

ID=8498172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/ES2002/000312 WO2003001508A1 (es) 2001-06-25 2002-06-25 Method for identifying audio sequences

Country Status (3)

Country Link
EP (1) EP1439523A1 (es)
ES (1) ES2190342B1 (es)
WO (1) WO2003001508A1 (es)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112863541B (zh) * 2020-12-31 2024-02-09 福州数据技术研究院有限公司 An audio segmentation method and system based on clustering and median convergence

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US5937384A (en) * 1996-05-01 1999-08-10 Microsoft Corporation Method and system for speech recognition using continuous density hidden Markov models
US5890111A (en) * 1996-12-24 1999-03-30 Technology Research Association Of Medical Welfare Apparatus Enhancement of esophageal speech by injection noise rejection
CA2216224A1 (en) * 1997-09-19 1999-03-19 Peter R. Stubley Block algorithm for pattern recognition
US6182036B1 (en) * 1999-02-23 2001-01-30 Motorola, Inc. Method of extracting features in a voice recognition system

Also Published As

Publication number Publication date
EP1439523A1 (en) 2004-07-21
WO2003001508B1 (es) 2004-07-08
ES2190342A1 (es) 2003-07-16
ES2190342B1 (es) 2004-11-16
WO2003001508A1 (es) 2003-01-03

Similar Documents

Publication Publication Date Title
EP0982713A3 (en) Voice converter with extraction and modification of attribute data
AU2001289766A1 (en) System and methods for recognizing sound and music signals in high noise and distortion
AU1740801A (en) Methods and apparatuses for signal analysis
WO2003007235A1 (en) Method and apparatus for identifying an unknown work
AU578438B2 (en) Text to speech system
CN107274911A (zh) A similarity analysis method based on sound features
CA2228948A1 (en) Pattern recognition
US4827519A (en) Voice recognition system using voice power patterns
US5963904A (en) Phoneme dividing method using multilevel neural network
CN110473547A (zh) A speech recognition method
WO2003050720A3 (en) Database system having heterogeneous object types
CN101594527B (zh) Two-stage method for high-precision template detection in audio and video streams
GB2406415A (en) Waveform analysis
WO2003096057A3 (en) Methods and apparatus for radar data processing with filter having reduced number of computation
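CN110428835A (zh) Adjustment method and apparatus for a voice device, storage medium, and voice device
CN110910865A (zh) Voice conversion method and apparatus, storage medium, and electronic apparatus
JPH069000B2 (ja) Speech information processing method
CN109104258A (zh) A radio identification method based on keyword recognition
WO2003001508A8 (es) Method for identifying audio sequences
CN116612788A (zh) Emotion recognition method, apparatus, device, and medium for audio data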
JPH0237600B2 (es)
CA2151330A1 (en) A speech recognizer
CN106448676A (zh) A robot speech recognition system based on natural language processing
Nishi et al. Optimum harmonics tracking filter for auditory scene analysis
CN109788399A (zh) An echo cancellation method and system for a loudspeaker

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase

Ref document number: 2002743274

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

B Later publication of amended claims

Effective date: 20021209

WWP WIPO information: published in national office

Ref document number: 2002743274

Country of ref document: EP

CFP Corrected version of a pamphlet front page

Free format text: UNDER (54) PUBLISHED TITLE REPLACED BY CORRECT TITLE

WWW WIPO information: withdrawn in national office

Ref document number: 2002743274

Country of ref document: EP

NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP