BR112015018597A2 - Method and device for audio recognition

Method and device for audio recognition

Info

Publication number
BR112015018597A2
Authority
BR
Brazil
Prior art keywords
maximum value
audio
phase
characteristic
audio document
Prior art date
Application number
BR112015018597A
Other languages
English (en)
Inventor
Xiao Bin
Chen Bo
Xie Dadong
Liu Hailong
Hou Jie
Liu Xiao
Original Assignee
Tencent Tech Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-02-04
Filing date
2013-10-16
Publication date
2017-07-18
Application filed by Tencent Tech Shenzhen Co Ltd
Publication of BR112015018597A2

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H5/00 Instruments in which the tones are generated by means of electronic generators
    • G10H5/005 Voice controlled instruments
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stereophonic System (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

Abstract of the patent of invention for: "Method and device for audio recognition". A method and device for performing audio recognition includes: collecting a first audio document to be recognized; initiating the calculation of first characteristic information of the first audio document, including: conducting time-frequency analysis on the first audio document to generate a first preset number of phase channels; and extracting at least one peak (maximum-value) characteristic point from each phase channel of the first preset number of phase channels, wherein the at least one peak characteristic point of each phase channel constitutes the peak characteristic point sequence of that phase channel; and obtaining a recognition result for the first audio document, wherein the recognition result is identified based on the first characteristic information, and wherein the first characteristic information is calculated based on the respective peak characteristic point sequences of the preset number of phase channels.
BR112015018597A 2013-02-04 2013-10-16 Method and device for audio recognition BR112015018597A2 (pt)
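
The feature-extraction steps named in the abstract (time-frequency analysis, a preset number of phase channels, one peak sequence per channel) can be illustrated with a minimal Python sketch. The abstract does not say how frames are assigned to phase channels or how a peak is defined, so two assumptions are made here and marked in the comments: frame t is assigned to channel t modulo the channel count, and a peak is a local energy maximum along the frequency axis above a threshold. All function names and parameters (`spectrogram`, `phase_channels`, `peak_sequence`, `characteristic_information`, `frame_len`, `hop`, `min_energy`) are hypothetical, and the final matching step that produces the recognition result is not shown.

```python
import cmath
import math
from typing import List, Tuple

Peak = Tuple[int, int, float]  # (frame index, frequency bin, energy)
Spectrogram = List[List[float]]


def spectrogram(samples: List[float], frame_len: int, hop: int) -> Spectrogram:
    """Energy spectrogram from a naive DFT (slow, but dependency-free)."""
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            row.append(abs(acc) ** 2)
        rows.append(row)
    return rows


def phase_channels(spec: Spectrogram, num_channels: int):
    """ASSUMPTION: frame t belongs to channel t % num_channels."""
    channels = [[] for _ in range(num_channels)]
    for t, row in enumerate(spec):
        channels[t % num_channels].append((t, row))
    return channels


def peak_sequence(channel, min_energy: float) -> List[Peak]:
    """ASSUMPTION: a peak is a local maximum along frequency above min_energy."""
    peaks = []
    for t, row in channel:
        for k in range(1, len(row) - 1):
            if row[k] >= min_energy and row[k] > row[k - 1] and row[k] > row[k + 1]:
                peaks.append((t, k, row[k]))
    return peaks


def characteristic_information(samples: List[float], frame_len: int = 32,
                               hop: int = 32, num_channels: int = 4,
                               min_energy: float = 1.0) -> List[List[Peak]]:
    """One peak sequence per phase channel, ordered by time then frequency."""
    spec = spectrogram(samples, frame_len, hop)
    return [peak_sequence(ch, min_energy) for ch in phase_channels(spec, num_channels)]


# Usage: a pure tone that falls exactly on frequency bin 4 of a 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(256)]
sequences = characteristic_information(tone)
for channel_index, seq in enumerate(sequences):
    print(channel_index, [(t, k) for t, k, _ in seq])
```

Keeping one sequence per channel means each channel can be matched against a reference database on its own, which is presumably why the abstract bases the recognition result on the per-channel sequences rather than on a single merged list.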

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310042408.0A CN103971689B (zh) 2013-02-04 2013-02-04 Audio recognition method and device
PCT/CN2013/085309 WO2014117542A1 (en) 2013-02-04 2013-10-16 Method and device for audio recognition

Publications (1)

Publication Number Publication Date
BR112015018597A2 (pt) 2017-07-18

Family

ID=51241107

Family Applications (1)

Application Number Title Priority Date Filing Date
BR112015018597A BR112015018597A2 (pt) 2013-02-04 2013-10-16 Method and device for audio recognition

Country Status (7)

Country Link
JP (1) JP6090881B2 (pt)
KR (1) KR101625944B1 (pt)
CN (1) CN103971689B (pt)
BR (1) BR112015018597A2 (pt)
CA (1) CA2899657C (pt)
TW (1) TWI494917B (pt)
WO (1) WO2014117542A1 (pt)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9324330B2 (en) * 2012-03-29 2016-04-26 Smule, Inc. Automatic conversion of speech into song, rap or other audible expression having target meter or rhythm
US9837068B2 (en) * 2014-10-22 2017-12-05 Qualcomm Incorporated Sound sample verification for generating sound detection model
JP6392450B2 (ja) * 2015-04-13 2018-09-19 日本電信電話株式会社 Matching device, determination device, methods therefor, program, and recording medium
EP3304251B1 (en) * 2015-06-03 2023-10-11 Razer (Asia-Pacific) Pte. Ltd. Haptics devices and methods for controlling a haptics device
CN105139866B (zh) * 2015-08-10 2018-10-16 泉州师范学院 Nanyin recognition method and device
CN106558318B (zh) * 2015-09-24 2020-04-28 阿里巴巴集团控股有限公司 Audio recognition method and system
CN105632513A (zh) * 2015-12-18 2016-06-01 合肥寰景信息技术有限公司 Voice filtering method for an online community
CN105575400A (zh) * 2015-12-24 2016-05-11 广东欧珀移动通信有限公司 Method, terminal, server, and system for obtaining song information
CN105589970A (zh) * 2015-12-25 2016-05-18 小米科技有限责任公司 Music search method and device
EP3208800A1 (en) * 2016-02-17 2017-08-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for stereo filing in multichannel coding
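CN105868397B (zh) * 2016-04-19 2020-12-01 腾讯科技(深圳)有限公司 Song determination method and device
CN105825850B (zh) * 2016-04-29 2021-08-24 腾讯科技(深圳)有限公司 Audio processing method and device
CN108205546B (zh) * 2016-12-16 2021-01-12 北京酷我科技有限公司 System and method for matching song information
CN106708465A (zh) * 2016-12-16 2017-05-24 北京小米移动软件有限公司 Control method and device for smart shoes
CN110322897B (zh) 2018-03-29 2021-09-03 北京字节跳动网络技术有限公司 Audio retrieval and recognition method and device
CN110209872B (zh) * 2019-05-29 2021-06-22 天翼爱音乐文化科技有限公司 Method, apparatus, computer device, and storage medium for generating lyrics for an audio clip
CN110289013B (zh) * 2019-07-24 2023-12-19 腾讯科技(深圳)有限公司 Method, apparatus, storage medium, and computer device for detecting multiple audio capture sources
CN111161758B (zh) * 2019-12-04 2023-03-31 厦门快商通科技股份有限公司 Audio-fingerprint-based song identification method, system, and audio device
CN112784098A (zh) * 2021-01-28 2021-05-11 百果园技术(新加坡)有限公司 Audio search method, apparatus, computer device, and storage medium
CN113268630B (zh) * 2021-06-08 2023-03-10 腾讯音乐娱乐科技(深圳)有限公司 Audio retrieval method, device, and medium
CN113836346B (zh) * 2021-09-08 2023-08-08 网易(杭州)网络有限公司 Method, apparatus, computing device, and storage medium for generating a summary of an audio file
CN115956270A (zh) * 2022-10-10 2023-04-11 广州酷狗计算机科技有限公司 Audio processing method, apparatus, device, and storage medium
CN115910042B (zh) * 2023-01-09 2023-05-05 百融至信(北京)科技有限公司 Method and apparatus for identifying the information type of a formatted audio file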

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62159195A (ja) * 1986-01-06 1987-07-15 沖電気工業株式会社 Voice pattern creation method
GR1003625B (el) * 1999-07-08 2001-08-31 Method for the chemical deposition of composite conducting-polymer coatings on aluminium alloy surfaces
US6990453B2 (en) * 2000-07-31 2006-01-24 Landmark Digital Services Llc System and methods for recognizing sound and music signals in high noise and distortion
US7277766B1 (en) * 2000-10-24 2007-10-02 Moodlogic, Inc. Method and system for analyzing digital audio files
EP1504445B1 (en) * 2002-04-25 2008-08-20 Landmark Digital Services LLC Robust and invariant audio pattern matching
SG120121A1 (en) * 2003-09-26 2006-03-28 St Microelectronics Asia Pitch detection of speech signals
US7672838B1 (en) * 2003-12-01 2010-03-02 The Trustees Of Columbia University In The City Of New York Systems and methods for speech recognition using frequency domain linear prediction polynomials to form temporal and spectral envelopes from frequency domain representations of signals
JP2006106535A (ja) * 2004-10-08 2006-04-20 Nippon Telegr & Teleph Corp <Ntt> Acoustic signal storage and retrieval device, and acoustic signal storage and retrieval program
US20070195963A1 (en) * 2006-02-21 2007-08-23 Nokia Corporation Measuring ear biometrics for sound optimization
US7921116B2 (en) * 2006-06-16 2011-04-05 Microsoft Corporation Highly meaningful multimedia metadata creation and associations
CN101465122A (zh) * 2007-12-20 2009-06-24 株式会社东芝 Method and system for detecting speech spectral peaks and for speech recognition
CN102053998A (zh) * 2009-11-04 2011-05-11 周明全 Method and system device for retrieving songs by sound
US8886531B2 (en) * 2010-01-13 2014-11-11 Rovi Technologies Corporation Apparatus and method for generating an audio fingerprint and using a two-stage query
JP5907511B2 (ja) * 2010-06-09 2016-04-26 アデルフォイ リミテッド System and method for audio media recognition
TWI426501B (zh) * 2010-11-29 2014-02-11 Inst Information Industry Melody recognition method and device thereof
US8818806B2 (en) * 2010-11-30 2014-08-26 JVC Kenwood Corporation Speech processing apparatus and speech processing method
CN102063904B (zh) * 2010-11-30 2012-06-27 广州酷狗计算机科技有限公司 Melody extraction method and melody recognition system for audio files
US20120296458A1 (en) * 2011-05-18 2012-11-22 Microsoft Corporation Background Audio Listening for Content Recognition
CN102332262B (zh) * 2011-09-23 2012-12-19 哈尔滨工业大学深圳研究生院 Intelligent song recognition method based on audio features

Also Published As

Publication number Publication date
CN103971689B (zh) 2016-01-27
KR101625944B1 (ko) 2016-05-31
CA2899657A1 (en) 2014-08-07
WO2014117542A1 (en) 2014-08-07
CN103971689A (zh) 2014-08-06
JP6090881B2 (ja) 2017-03-08
TWI494917B (zh) 2015-08-01
KR20150108936A (ko) 2015-09-30
JP2016512610A (ja) 2016-04-28
CA2899657C (en) 2017-08-01
TW201432674A (zh) 2014-08-16

Similar Documents

Publication Publication Date Title
BR112015018597A2 (pt) Method and device for audio recognition
PH12018502583A1 (en) Method and device for pushing information
SG10201907025VA (en) Method and system for verifying identities
SG11201504973SA (en) Method and system for performing an audio information collection and query
EP3206205A4 (en) Voiceprint information management method and device as well as identity authentication method and system
MY185366A (en) Audio information processing method and device
CL2014002551A1 (es) Method for identifying an audio segment, comprising the steps of creating a spectrogram of the candidate audio segment, creating a candidate acoustic-fingerprint bitmap of the spectrogram, comparing the candidate map with at least one known map of a known network message, declaring a match if the candidate map matches a known map, and, if the candidate does not match, using a detection algorithm to analyze the candidate audio segment; associated methods.
EP3253005A4 (en) Data file registration management system, method, management device, and program
BR112016010947A2 (pt) Voice recognition method, voice recognition device, and electronic device
DK2984647T3 (da) System and method for generating an audio file
EP3447658A4 (en) SYSTEM, DEVICE AND METHOD FOR PROVIDING INFORMATION
EP2983380A4 (en) USER DETECTING SYSTEM ABOUT SHORT DISTANCE AND SYSTEM AND METHOD FOR PROVIDING INFORMATION THEREWITH
BR112016024885A2 (pt) Search intent identification
GB201715163D0 (en) Device, system and method to provide an auto-focus capability based on object distance information
HK1217786A1 (zh) Index generation device and method, and search device and search method
DK2988804T3 (da) Supplementary device for collecting information about the use of an injection device
EP3079266A4 (en) Method, device and system for obtaining channel information
MX2015003137A (es) Method for providing synchronization in a data acquisition system.
DK3336998T3 (da) Emergency power system, converter for an emergency power system, and method for operating an emergency power system
EP3429167B8 (en) Published information processing method and device, and information publishing system
BR112017009523A2 (pt) Digital signal processing method for mapping a symbol onto a pulse sequence, digital signal processing device, and digital signal processing method for determining a puncturing condition.
EP3001410A4 (en) MUSIC INTERPRETATION RECORDING SYSTEM, MUSICAL INTERPRETATION RECORDING METHOD, AND MUSIC INSTRUMENT
EP3709183A4 (en) SIMILARITY INDEX CALCULATION DEVICE, SIMILARITY RESEARCH DEVICE AND SIMILARITY INDEX CALCULATION PROGRAM
EP3246662A4 (en) Poi information provision server, poi information provision device, poi information provision system, and program
MX2018014113A (es) Apparatus for determining similarity information, method for determining similarity information, apparatus for determining autocorrelation information, apparatus for determining cross-correlation information, and computer program.

Legal Events

Date Code Title Description
B06F Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]
B15K Others concerning applications: alteration of classification

Free format text: THE PREVIOUS CLASSIFICATION WAS: G10L 15/30

Ipc: G10L 25/51 (2013.01), G10L 25/18 (2013.01), G10L 1

B07A Application suspended after technical examination (opinion) [chapter 7.1 patent gazette]
B09B Patent application refused [chapter 9.2 patent gazette]
B09B Patent application refused [chapter 9.2 patent gazette]

Free format text: REFUSAL MAINTAINED, SINCE NO APPEAL WAS FILED WITHIN THE LEGAL TIME LIMIT