IN2014MN01588A - Google Patents

Info

Publication number
IN2014MN01588A
Authority
IN
India
Prior art keywords
classification
speech
frame
music
classified
Application number
IN1588MUN2014
Inventor
Venkatraman Srinivasa Atti
Ethan Robert Duni
Original Assignee
Qualcomm Inc
Priority date
2012-01-13
Filing date
2012-12-21
Publication date
2015-05-08
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=48780608&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=IN2014MN01588(A) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Qualcomm Inc
Publication of IN2014MN01588A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/173 Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/81 Detection of presence or absence of voice signals for discriminating voice from music

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Improved audio classification is provided for encoding applications. An initial classification is performed, followed by a finer classification, to produce speech and music classifications with higher accuracy and less complexity than previously available. Audio is classified as speech or music on a frame-by-frame basis. If a frame is classified as music by the initial classification, that frame undergoes a second, finer classification to confirm that it is music and not speech (e.g., speech that is tonal and/or structured and that may not have been classified as speech by the initial classification). Depending on the implementation, one or more parameters may be used in the finer classification. Example parameters include voicing, modified correlation, signal activity, and long-term pitch gain.
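
The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the `FrameFeatures` fields mirror the four example parameters named above, but their value ranges, the threshold constants, the direction of each comparison, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class AudioClass(Enum):
    SPEECH = "speech"
    MUSIC = "music"


@dataclass
class FrameFeatures:
    """Per-frame parameters named in the abstract (value ranges are assumed)."""
    voicing: float               # assumed 0..1, higher means more voiced
    modified_correlation: float  # assumed 0..1
    signal_activity: bool        # assumed flag: the frame contains active signal
    long_term_pitch_gain: float  # assumed 0..1


# Hypothetical thresholds, chosen only for illustration.
VOICING_MIN = 0.6
CORRELATION_MIN = 0.7
PITCH_GAIN_MIN = 0.5


def finer_classification(f: FrameFeatures) -> AudioClass:
    """Second pass, run only on frames the initial pass labelled as music."""
    looks_like_tonal_speech = (
        f.signal_activity
        and f.voicing >= VOICING_MIN
        and f.modified_correlation >= CORRELATION_MIN
        and f.long_term_pitch_gain >= PITCH_GAIN_MIN
    )
    return AudioClass.SPEECH if looks_like_tonal_speech else AudioClass.MUSIC


def classify_frame(initial: AudioClass, f: FrameFeatures) -> AudioClass:
    """Keep an initial speech decision; re-check an initial music decision."""
    if initial is AudioClass.SPEECH:
        return AudioClass.SPEECH
    return finer_classification(f)
```

Only frames that the initial pass labels as music reach the second pass, so the extra checks add no work for frames already classified as speech.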
Application number: IN1588MUN2014 | Priority date: 2012-01-13 | Filing date: 2012-12-21 | Title: IN2014MN01588A (en)

Applications Claiming Priority (3)

Application Number | Publication | Priority Date | Filing Date | Title
US201261586374P | - | 2012-01-13 | 2012-01-13 | -
US13/722,669 | US9111531B2 (en) | 2012-01-13 | 2012-12-20 | Multiple coding mode signal classification
PCT/US2012/071217 | WO2013106192A1 (en) | 2012-01-13 | 2012-12-21 | Multiple coding mode signal classification

Publications (1)

Publication Number | Publication Date
IN2014MN01588A (en) | 2015-05-08

Family

ID=48780608

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
IN1588MUN2014 | IN2014MN01588A (en) | 2012-01-13 | 2012-12-21

Country Status (12)

Country | Link
US (1) | US9111531B2 (en)
EP (1) | EP2803068B1 (en)
JP (1) | JP5964455B2 (en)
KR (2) | KR20170005514A (en)
CN (1) | CN104040626B (en)
BR (1) | BR112014017001B1 (en)
DK (1) | DK2803068T3 (en)
ES (1) | ES2576232T3 (en)
HU (1) | HUE027037T2 (en)
IN (1) | IN2014MN01588A (en)
SI (1) | SI2803068T1 (en)
WO (1) | WO2013106192A1 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9589570B2 (en) | 2012-09-18 | 2017-03-07 | Huawei Technologies Co., Ltd. | Audio classification based on perceptual quality for low or medium bit rates
PL2922052T3 (en) * | 2012-11-13 | 2021-12-20 | Samsung Electronics Co., Ltd. | Method for determining an encoding mode
CN106409313B (en) | 2013-08-06 | 2021-04-20 | 华为技术有限公司 | A kind of audio signal classification method and device
CN104424956B9 (en) * | 2013-08-30 | 2022-11-25 | 中兴通讯股份有限公司 | Activation tone detection method and device
EP3109861B1 (en) * | 2014-02-24 | 2018-12-12 | Samsung Electronics Co., Ltd. | Signal classifying method and device, and audio encoding method and device using same
CN110619891B (en) * | 2014-05-08 | 2023-01-17 | 瑞典爱立信有限公司 | Audio signal discriminator and encoder
CN105336338B (en) * | 2014-06-24 | 2017-04-12 | 华为技术有限公司 | Audio coding method and device
CN106448688B (en) | 2014-07-28 | 2019-11-05 | 华为技术有限公司 | Audio coding method and relevant apparatus
US9886963B2 (en) * | 2015-04-05 | 2018-02-06 | Qualcomm Incorporated | Encoder selection
CN104867492B (en) * | 2015-05-07 | 2019-09-03 | 科大讯飞股份有限公司 | Intelligent interactive system and method
KR102398124B1 (en) * | 2015-08-11 | 2022-05-17 | 삼성전자주식회사 | Adaptive processing of audio data
US10186276B2 (en) * | 2015-09-25 | 2019-01-22 | Qualcomm Incorporated | Adaptive noise suppression for super wideband music
WO2017117234A1 (en) * | 2016-01-03 | 2017-07-06 | Gracenote, Inc. | Responding to remote media classification queries using classifier models and context parameters
US10678828B2 (en) | 2016-01-03 | 2020-06-09 | Gracenote, Inc. | Model-based media classification service using sensed media noise characteristics
JP6996185B2 (en) * | 2017-09-15 | 2022-01-17 | 富士通株式会社 | Utterance section detection device, utterance section detection method, and computer program for utterance section detection
EP3956890B1 (en) | 2019-04-18 | 2024-02-21 | Dolby Laboratories Licensing Corporation | A dialog detector
EP4421804A4 (en) * | 2021-10-21 | 2024-10-30 | Beijing Xiaomi Mobile Software Co., Ltd. | Signal encoding and decoding method and apparatus, and encoding device, decoding device, and storage medium
CN116149499B (en) * | 2023-04-18 | 2023-08-11 | 深圳雷柏科技股份有限公司 | Multi-mode switching control circuit and switching control method for mouse

Family Cites Families (39)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
DK1126437T3 (en) * | 1991-06-11 | 2004-11-08 | Qualcomm Inc | Variable speed vocoder
US5778335A (en) | 1996-02-26 | 1998-07-07 | The Regents Of The University Of California | Method and apparatus for efficient multiband celp wideband speech and music coding and decoding
US7072832B1 (en) * | 1998-08-24 | 2006-07-04 | Mindspeed Technologies, Inc. | System for speech encoding having an adaptive encoding arrangement
US6493665B1 (en) * | 1998-08-24 | 2002-12-10 | Conexant Systems, Inc. | Speech classification and parameter weighting used in codebook search
US7272556B1 (en) * | 1998-09-23 | 2007-09-18 | Lucent Technologies Inc. | Scalable and embedded codec for speech and audio signals
US6691084B2 (en) * | 1998-12-21 | 2004-02-10 | Qualcomm Incorporated | Multiple mode variable rate speech coding
JP2000267699A (en) * | 1999-03-19 | 2000-09-29 | Nippon Telegr & Teleph Corp <Ntt> | Acoustic signal encoding method and apparatus, program recording medium therefor, and acoustic signal decoding apparatus
CA2348659C (en) * | 1999-08-23 | 2008-08-05 | Kazutoshi Yasunaga | Apparatus and method for speech coding
US6604070B1 (en) * | 1999-09-22 | 2003-08-05 | Conexant Systems, Inc. | System of encoding and decoding speech signals
US6625226B1 (en) * | 1999-12-03 | 2003-09-23 | Allen Gersho | Variable bit rate coder, and associated method, for a communication station operable in a communication system
US6697776B1 (en) * | 2000-07-31 | 2004-02-24 | Mindspeed Technologies, Inc. | Dynamic signal detector system and method
US6694293B2 (en) | 2001-02-13 | 2004-02-17 | Mindspeed Technologies, Inc. | Speech coding system with a music classifier
US6785645B2 (en) | 2001-11-29 | 2004-08-31 | Microsoft Corporation | Real-time speech and music classifier
US6829579B2 (en) * | 2002-01-08 | 2004-12-07 | Dilithium Networks, Inc. | Transcoding method and system between CELP-based speech codes
US7657427B2 (en) * | 2002-10-11 | 2010-02-02 | Nokia Corporation | Methods and devices for source controlled variable bit-rate wideband speech coding
US7363218B2 (en) * | 2002-10-25 | 2008-04-22 | Dilithium Networks Pty. Ltd. | Method and apparatus for fast CELP parameter mapping
FI118834B (en) * | 2004-02-23 | 2008-03-31 | Nokia Corp | Classification of audio signals
ATE457512T1 (en) * | 2004-05-17 | 2010-02-15 | Nokia Corp | Audio coding with different coding frame length
US8010350B2 (en) | 2006-08-03 | 2011-08-30 | Broadcom Corporation | Decimated bisectional pitch refinement
CN1920947B (en) * | 2006-09-15 | 2011-05-11 | 清华大学 | Voice/music detector for audio frequency coding with low bit ratio
CN101197130B (en) * | 2006-12-07 | 2011-05-18 | 华为技术有限公司 | Sound activity detecting method and detector thereof
KR100964402B1 (en) * | 2006-12-14 | 2010-06-17 | 삼성전자주식회사 | Method and apparatus for determining encoding mode of audio signal and method and apparatus for encoding/decoding audio signal using same
KR100883656B1 (en) | 2006-12-28 | 2009-02-18 | 삼성전자주식회사 | Method and apparatus for classifying audio signals and method and apparatus for encoding/decoding audio signals using the same
CN101226744B (en) * | 2007-01-19 | 2011-04-13 | 华为技术有限公司 | Method and device for implementing voice decode in voice decoder
KR100925256B1 (en) * | 2007-05-03 | 2009-11-05 | 인하대학교 산학협력단 | How to classify voice and music in real time
CN101393741A (en) * | 2007-09-19 | 2009-03-25 | 中兴通讯股份有限公司 | Audio signal classification apparatus and method used in wideband audio encoder and decoder
CN101399039B (en) * | 2007-09-30 | 2011-05-11 | 华为技术有限公司 | Method and device for determining non-noise audio signal classification
CN101221766B (en) * | 2008-01-23 | 2011-01-05 | 清华大学 | Method for switching audio encoder
CN101236742B (en) * | 2008-03-03 | 2011-08-10 | 中兴通讯股份有限公司 | Music/ non-music real-time detection method and device
JP5266341B2 (en) | 2008-03-03 | 2013-08-21 | エルジー エレクトロニクス インコーポレイティド | Audio signal processing method and apparatus
US8768690B2 (en) * | 2008-06-20 | 2014-07-01 | Qualcomm Incorporated | Coding scheme selection for low-bit-rate applications
EP2301011B1 (en) | 2008-07-11 | 2018-07-25 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and discriminator for classifying different segments of an audio signal comprising speech and music segments
EP2144230A1 (en) * | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Low bitrate audio encoding/decoding scheme having cascaded switches
KR101261677B1 (en) * | 2008-07-14 | 2013-05-06 | 광운대학교 산학협력단 | Apparatus for encoding and decoding of integrated voice and music
CN101751920A (en) * | 2008-12-19 | 2010-06-23 | 数维科技(北京)有限公司 | Audio classification and implementation method based on reclassification
CN101814289A (en) * | 2009-02-23 | 2010-08-25 | 数维科技(北京)有限公司 | Digital audio multi-channel coding method and system of DRA (Digital Recorder Analyzer) with low bit rate
JP5519230B2 (en) * | 2009-09-30 | 2014-06-11 | パナソニック株式会社 | Audio encoder and sound signal processing system
CN102237085B (en) * | 2010-04-26 | 2013-08-14 | 华为技术有限公司 | Method and device for classifying audio signals
HRP20250620T1 (en) | 2011-02-15 | 2025-07-18 | Voiceage Evs Llc | Device and method for quantizing the gain of adaptive and fixed excitation contributions in a CELP codec

Also Published As

Publication number | Publication date
KR20140116487A (en) | 2014-10-02
US9111531B2 (en) | 2015-08-18
BR112014017001A2 (en) | 2017-06-13
EP2803068B1 (en) | 2016-04-13
US20130185063A1 (en) | 2013-07-18
SI2803068T1 (en) | 2016-07-29
CN104040626B (en) | 2017-08-11
JP5964455B2 (en) | 2016-08-03
JP2015507222A (en) | 2015-03-05
DK2803068T3 (en) | 2016-05-23
EP2803068A1 (en) | 2014-11-19
HUE027037T2 (en) | 2016-08-29
WO2013106192A1 (en) | 2013-07-18
KR20170005514A (en) | 2017-01-13
ES2576232T3 (en) | 2016-07-06
BR112014017001A8 (en) | 2017-07-04
CN104040626A (en) | 2014-09-10
BR112014017001B1 (en) | 2020-12-22

Similar Documents

Publication | Publication Date | Title
IN2014MN01588A (en) | - | -
WO2014168934A3 (en) | - | Systems and methods for generating a digital output signal in a digital microphone system
PH12015501587A1 (en) | - | Signaling audio rendering information in a bitstream
MX351359B (en) | - | Encoder, decoder and methods for signal-dependent zoom-transform in spatial audio object coding.
MX2023001960A (en) | - | Multi-channel audio decoder, multi-channel audio encoder, methods and computer program using a residual-signal-based adjustment of a contribution of a decorrelated signal.
EP4246516A3 (en) | - | Device and method for reducing quantization noise in a time-domain decoder
MX2009007412A (en) | - | Audio decoder.
EP4465662A3 (en) | - | Compression of decomposed representations of a sound field
ATE527834T1 | - | Economical loudness measurement of coded audio
WO2010003109A3 (en) | - | Speech recognition with parallel recognition tasks
EP2846229A3 (en) | - | Systems and methods for generating haptic effects associated with audio signals
IN2015MN01766A (en) | - | -
MX2016000908A (en) | - | Apparatus and method for low delay object metadata coding.
MY171188A (en) | - | Systems and methods of performing filtering for gain determination
GB2526727A (en) | - | Systems and methods for using a speaker as a microphone in a mobile device
MY187728A (en) | - | Method and system for encoding audio data with adaptive low frequency compensation
MY165327A (en) | - | Apparatus, method and computer program for providing one or more adjusted parameters for provision of an upmix signal representation on the basis of a downmix signal representation and a parametric side information associated with the downmix signal representation, using an average value
MX352154B (en) | - | Unvoiced/voiced decision for speech processing.
MX2016004923A (en) | - | Concept for encoding an audio signal and decoding an audio signal using speech related spectral shaping information.
EP4358085A3 (en) | - | Signal processing device, method, and program
MX348811B (en) | - | Apparatus and method for spatial audio object coding employing hidden objects for signal mixture manipulation.
MX381836B (en) | - | Encoder, decoder and methods for signal-adaptive switching of the overlap ratio in audio transform coding
TH72937B | - | A terminal for providing one or more tuned parameters. For providing substitution of up-mix signals On the basis of downmix representations Audio decoder Audio codec Audio encoder Bit stream of audio signal And how to use parametric information in relation to objects.
PL402373A1 (en) | - | A way to improve speech intelligibility in a multi-channel multimedia signal, especially video and audio, and a system for implementing the method
MY197703A (en) | - | Audio signal classification method and apparatus