WO2013182085A1 - Self-adaptive intelligent voice device and method - Google Patents

Self-adaptive intelligent voice device and method

Info

Publication number
WO2013182085A1
WO2013182085A1 (PCT/CN2013/077225)
Authority
WO
WIPO (PCT)
Prior art keywords
voice
broadcast
parameter
voice parameter
module
Prior art date
Application number
PCT/CN2013/077225
Other languages
English (en)
French (fr)
Inventor
李向阳
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation
Priority to MYPI2015000965A priority Critical patent/MY177459A/en
Priority to US14/434,934 priority patent/US9552813B2/en
Priority to EP13799812.6A priority patent/EP2908312A4/en
Publication of WO2013182085A1 publication Critical patent/WO2013182085A1/zh

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use

Definitions

  • the present invention relates to the field of intelligent voice, and in particular, to an adaptive intelligent voice device and method.
  • the current intelligent voice application mainly includes three functional modules: a voice recognition module, a recognition result processing module, and a voice broadcast module:
  • the speech recognition module is configured to extract parameters characterizing human speech, convert the vocabulary content of the speech into a machine language file, such as a binary code file, according to those speech parameters, and send the machine language file to the recognition result processing module. The parameters characterizing human speech mainly include the formants (frequency, bandwidth, amplitude) and the pitch frequency.
  • the recognition result processing module is configured to perform a corresponding operation according to the machine language file and send the operation result to the voice broadcast module. For example, if the vocabulary content of the received machine language file is "Where am I", the recognition result processing module obtains the user's current location from a positioning module and sends the location information to the voice broadcast module;
  • the voice broadcast module is configured to convert the operation result sent by the recognition result processing module into an audio file for broadcasting in combination with the broadcast voice parameter.
  • in the related art, the broadcast voice parameter is either presented as options for the user to select, or fixed in the voice broadcast module when the product leaves the factory.
  • for the former, owing to differences among users, different users may need to reset the values of the broadcast voice parameters, which makes use complex and cumbersome; for the latter, since the same voice is used to broadcast to all users, the user experience is monotonous and dull.
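The three-module pipeline described above can be sketched as follows. This is a minimal illustrative sketch only; the function names, parameter names, and stand-in values (`recognize`, `process`, `broadcast`, the pitch figure, the location string) are assumptions of this sketch and are not prescribed by the patent.

```python
# Illustrative sketch of the three-module intelligent voice pipeline:
# recognition -> result processing -> broadcast. All names and values
# are hypothetical.

def recognize(audio):
    """Voice recognition module: extract voice parameters (formants, pitch
    frequency) and convert the vocabulary content into machine-readable text."""
    voice_params = {"pitch_hz": 120.0, "formants": [(700, 80, 1.0)]}  # stand-in values
    text = "Where am I"  # stand-in recognition result
    return voice_params, text

def process(text):
    """Recognition result processing module: act on the recognized content."""
    if text == "Where am I":
        return "You are at 5th Avenue"  # e.g. queried from a positioning module
    return "Sorry, I did not understand"

def broadcast(result, broadcast_params):
    """Voice broadcast module: render the result using the broadcast voice
    parameters (here simply tagged onto the text instead of synthesized)."""
    return f"[pitch={broadcast_params['pitch_hz']}Hz] {result}"

voice_params, text = recognize(None)
fixed_params = {"pitch_hz": 220.0}  # related art: parameters fixed at the factory
print(broadcast(process(text), fixed_params))
```

In the related art the `fixed_params` above never change; the embodiments that follow insert a generation step that derives them from `voice_params` instead.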
  • the embodiment of the present invention provides the following technical solutions:
  • A self-adaptive intelligent voice device includes a voice recognition module, a recognition result processing module, and a voice broadcast module, wherein the device further includes a broadcast voice parameter generation module, and the broadcast voice parameter generation module is configured to:
  • obtain the extracted voice parameters from the voice recognition module, generate broadcast voice parameters according to the voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module.
  • the broadcast voice parameter generating module is further configured to: obtain the voice parameter from the voice recognition module after receiving a specific trigger signal or when the device is powered on.
  • the preset policy includes a correspondence between the voice parameter and the broadcast voice parameter.
  • the broadcast voice parameter generation module is configured to generate the broadcast voice parameter according to the voice parameter and the preset policy in the following manner: obtaining the value of the voice parameter, and determining, through the preset policy, the value of the broadcast voice parameter corresponding to that value.
  • An adaptive intelligent voice method comprising:
  • after the voice parameters are extracted from the sound by voice recognition, the broadcast voice parameters are generated according to the voice parameters and a preset policy;
  • the broadcast voice is generated by the broadcast voice parameter.
  • the step of generating the broadcast voice parameter according to the voice parameter and the preset policy includes: generating the broadcast voice parameter according to the voice parameter and the preset policy after a specific trigger signal is received or at power-on.
  • the preset policy includes a correspondence between the voice parameter and the broadcast voice parameter.
  • the step of generating the broadcast voice parameter according to the voice parameter and the preset policy includes: obtaining the value of the voice parameter, and determining, through the preset policy, the value of the broadcast voice parameter corresponding to that value.
  • FIG. 1 is a block diagram of an adaptive intelligent voice device according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of an adaptive intelligent voice method according to an embodiment of the present invention.
  • the device includes a voice recognition module 101, a recognition result processing module 102, a voice broadcast module 103, and a broadcast voice parameter generation module 104.
  • the voice recognition module 101, the recognition result processing module 102, and the voice broadcast module 103 have been implemented in the related art, and are not described here.
  • the broadcast voice parameter generation module 104 is the module that this embodiment adds relative to the related art;
  • the broadcast voice parameter generation module 104 is configured to: obtain the extracted voice parameters from the voice recognition module 101, generate broadcast voice parameters according to the extracted voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module 103;
  • the preset policy provides a correspondence between input parameters and output parameters, where the input parameters are the extracted voice parameters and the output parameters are the broadcast voice parameters;
  • the correspondence may be a simple numerical mapping or a complex algorithmic operation;
  • after obtaining the value of an extracted voice parameter, the broadcast voice parameter generation module 104 determines, through the preset policy, the value of the broadcast voice parameter corresponding to it, thereby obtaining the broadcast voice parameter;
  • the preset policy may be, for example: when the input extracted voice parameter characterizes a male voice, the output broadcast voice parameter characterizes a female voice;
  • when the input extracted voice parameter characterizes a child's voice, the output broadcast voice parameter characterizes a deep (heavy) voice;
  • the speaking rate characterized by the output broadcast voice parameter is at the same level as the speaking rate characterized by the input extracted voice parameter;
  • the loudness characterized by the output broadcast voice parameter is at the same level as the loudness characterized by the input extracted voice parameter;
  • the broadcast voice parameter generation module 104 may obtain the extracted voice parameters from the voice recognition module 101 after receiving a specific trigger signal (e.g., an indication signal from the user to enable self-adaptive intelligent voice) or when the device is powered on.
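A minimal sketch of one such preset policy follows. The pitch-frequency thresholds used to tell male, female, and child voices apart, and the parameter names (`pitch_hz`, `rate_level`, `loudness_level`), are illustrative assumptions of this sketch; the patent only specifies the input/output correspondence, not concrete values.

```python
# Hypothetical preset policy mapping extracted voice parameters to broadcast
# voice parameters: male -> female voice, child -> deep voice, while speaking
# rate and loudness stay at the same level as the input. Thresholds are rough
# illustrative values, not taken from the patent.

def preset_policy(extracted):
    pitch = extracted["pitch_hz"]
    # Classify the speaker from the pitch frequency (illustrative thresholds).
    if pitch < 160:          # male voice -> answer with a female voice
        out_pitch = 220.0
    elif pitch > 300:        # child's voice -> answer with a deep voice
        out_pitch = 110.0
    else:                    # otherwise keep a comparable register
        out_pitch = pitch
    # Speaking rate and loudness remain at the same level as the input.
    return {
        "pitch_hz": out_pitch,
        "rate_level": extracted["rate_level"],
        "loudness_level": extracted["loudness_level"],
    }

params = preset_policy({"pitch_hz": 120.0, "rate_level": 2, "loudness_level": 3})
```

Here a male input voice (120 Hz) yields female-voice broadcast parameters at the same rate and loudness level; a more elaborate policy could equally be an arbitrary algorithmic operation, as the text above allows.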
  • by providing the broadcast voice parameter generation module 104 in the intelligent voice device, the above embodiment makes the voice parameters used for broadcasting take into account the voice parameters input by the user, so that the broadcast voice adaptively changes according to differentiated user voice characteristics.
  • compared with the current related art, this not only reduces the complexity of different users frequently configuring the voice broadcast, but also improves the flexibility and vividness of the voice broadcast, and can greatly improve the comfort of the user's human-computer interaction experience.
  • FIG. 2 is a flowchart of an adaptive intelligent voice method according to an embodiment of the present invention. As shown in FIG. 2, the method mainly includes the following steps:
  • S201: extract voice parameters from the sound by voice recognition. S202: generate broadcast voice parameters according to the extracted voice parameters and a preset policy. In this step, the broadcast voice parameters may be generated after a specific trigger signal is received (such as an indication signal from the user to enable self-adaptive intelligent voice) or at power-on;
  • the preset policy contains the correspondence between the extracted voice parameters and the broadcast voice parameters, where the input parameters are the extracted voice parameters and the output parameters are the broadcast voice parameters; the correspondence may be a simple numerical mapping or a complex algorithmic operation;
  • the preset policy may be, for example: when the input extracted voice parameter characterizes a male voice, the output broadcast voice parameter characterizes a female voice;
  • when the input extracted voice parameter characterizes a child's voice, the output broadcast voice parameter characterizes a deep (heavy) voice;
  • the speaking rate characterized by the output broadcast voice parameter is at the same level as the speaking rate characterized by the input extracted voice parameter;
  • the loudness characterized by the output broadcast voice parameter is at the same level as the loudness characterized by the input extracted voice parameter;
  • S203: generate the broadcast voice with the broadcast voice parameters.
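The method flow of FIG. 2, including the trigger condition, can be sketched as below. The function names (`adaptive_broadcast_params`, `double_pitch`) and the fallback of reusing the last generated parameters when no trigger arrives are assumptions of this sketch, not details specified by the patent.

```python
# Hypothetical sketch of the method of FIG. 2: broadcast voice parameters are
# generated from the extracted ones only when triggered (a user indication
# signal, or power-on when no parameters exist yet); otherwise the previously
# generated parameters are reused.

def adaptive_broadcast_params(extracted, policy, triggered, last_params):
    """Apply the preset policy when a trigger signal was received, or at
    power-on (no previous parameters); otherwise keep the last parameters."""
    if triggered or last_params is None:
        return policy(extracted)
    return last_params

# A toy preset policy: echo the input voice one octave higher.
double_pitch = lambda p: {"pitch_hz": p["pitch_hz"] * 2}

first = adaptive_broadcast_params({"pitch_hz": 100.0}, double_pitch,
                                  triggered=False, last_params=None)
second = adaptive_broadcast_params({"pitch_hz": 150.0}, double_pitch,
                                   triggered=False, last_params=first)
# first is generated at power-on; second keeps first because no trigger arrived.
```

Passing `triggered=True` on the second call would regenerate the parameters from the new input, matching the "specific trigger signal" behavior described above.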
  • each module/unit in the foregoing embodiments may be implemented in the form of hardware, or in the form of software functional modules. The invention is not limited to any specific combination of hardware and software.
  • the foregoing technical solution establishes, through a preset policy, a connection between the broadcast voice parameters and the voice parameters input by the user, avoiding the deficiency caused by using fixed data for the broadcast voice parameters without considering the user's voice characteristics; in addition, generating the broadcast voice parameters requires no manual involvement, which is convenient for users.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)
  • Circuits Of Receivers In General (AREA)

Abstract

A self-adaptive intelligent voice device and method, the device including a voice recognition module (101), a recognition result processing module (102), a voice broadcast module (103), and a broadcast voice parameter generation module (104). The broadcast voice parameter generation module (104) is configured to: obtain the extracted voice parameters from the voice recognition module (101), generate broadcast voice parameters according to the extracted voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module (103). The above technical solution makes the broadcast voice better match the user's voice.

Description

Self-adaptive intelligent voice device and method
Technical Field
The present invention relates to the field of intelligent voice, and in particular to a self-adaptive intelligent voice device and method.
Background Art
With the development of mobile communication technology and handset manufacturing, smartphones have won over more and more consumers with their high performance, support for a wide range of services, and steadily falling cost. As smartphone hardware performance improves and operating systems grow more capable, more and more intelligent applications can be realized, including intelligent voice services. Compared with traditional manual human-computer interaction, intelligent voice is favored by a growing number of users for its more natural and convenient way of interacting, and a series of intelligent voice applications such as Siri have appeared on smartphone platforms such as Apple and Android.
Current intelligent voice applications mainly comprise three functional modules: a voice recognition module, a recognition result processing module, and a voice broadcast module.
The voice recognition module is used to extract parameters characterizing human speech, convert the vocabulary content of the speech into a machine language file, such as a binary code file, according to those voice parameters, and send the machine language file to the recognition result processing module. The parameters characterizing human speech mainly include the formants (frequency, bandwidth, amplitude) and the pitch frequency.
The recognition result processing module is used to perform the corresponding operation according to the machine language file and send the operation result to the voice broadcast module. For example, if the vocabulary content represented by the received machine language file is "Where am I", the recognition result processing module obtains the user's current location from a positioning module and sends the location information to the voice broadcast module.
The voice broadcast module is used to convert the operation result sent by the recognition result processing module into an audio file for broadcast, in combination with the broadcast voice parameters.
In the related art, the broadcast voice parameters are either presented as options for the user to select, or fixed in the voice broadcast module when the product leaves the factory. For the former, owing to differences among users, different users may need to reset the values of the broadcast voice parameters, which makes use complex and cumbersome; for the latter, since the same voice is used to broadcast to all users, the user experience is monotonous and dull.
Summary of the Invention
The purpose of embodiments of the present invention is to provide a self-adaptive intelligent voice device and method, so as to solve the technical problem of making the broadcast voice better match the user's voice.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
A self-adaptive intelligent voice device includes a voice recognition module, a recognition result processing module, and a voice broadcast module, wherein the device further includes a broadcast voice parameter generation module configured to: obtain the extracted voice parameters from the voice recognition module, generate broadcast voice parameters according to the voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module.
Optionally, the broadcast voice parameter generation module is further configured to obtain the voice parameters from the voice recognition module after receiving a specific trigger signal or when the device is powered on.
Optionally, the preset policy contains the correspondence between the voice parameters and the broadcast voice parameters.
Optionally, the broadcast voice parameter generation module is configured to generate the broadcast voice parameters according to the voice parameters and the preset policy in the following manner:
obtaining the values of the voice parameters, and determining, through the preset policy, the values of the broadcast voice parameters corresponding to the values of the voice parameters.
A self-adaptive intelligent voice method includes:
after extracting voice parameters from a sound through voice recognition, generating broadcast voice parameters according to the voice parameters and a preset policy;
generating a broadcast voice with the broadcast voice parameters.
Optionally, the step of generating the broadcast voice parameters according to the voice parameters and the preset policy includes: generating the broadcast voice parameters according to the voice parameters and the preset policy after a specific trigger signal is received or at power-on.
Optionally, the preset policy contains the correspondence between the voice parameters and the broadcast voice parameters.
Optionally, the step of generating the broadcast voice parameters according to the voice parameters and the preset policy includes:
obtaining the values of the voice parameters, and determining, through the preset policy, the values of the broadcast voice parameters corresponding to the values of the voice parameters.
The above technical solution establishes, through a preset policy, a connection between the broadcast voice parameters and the voice parameters input by the user, avoiding the deficiency caused by using fixed data for the broadcast voice parameters without considering the user's voice characteristics. In addition, generating the broadcast voice parameters requires no manual involvement, which is convenient for users.
Brief Description of the Drawings
FIG. 1 is a block diagram of a self-adaptive intelligent voice device according to an embodiment of the present invention.
FIG. 2 is a flowchart of a self-adaptive intelligent voice method according to an embodiment of the present invention.
Preferred Embodiments of the Invention
To make the purpose, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with each other arbitrarily.
FIG. 1 is a block diagram of a self-adaptive intelligent voice device according to an embodiment of the present invention. As shown in the figure, the device includes a voice recognition module 101, a recognition result processing module 102, a voice broadcast module 103, and a broadcast voice parameter generation module 104. The voice recognition module 101, the recognition result processing module 102, and the voice broadcast module 103 have already been implemented in the related art and are not described again here; the broadcast voice parameter generation module 104 is the module this embodiment adds relative to the related art.
The broadcast voice parameter generation module 104 is configured to: obtain the extracted voice parameters from the voice recognition module 101, generate broadcast voice parameters according to the extracted voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module 103.
The preset policy gives a correspondence between input parameters and output parameters, where the input parameters are the extracted voice parameters and the output parameters are the broadcast voice parameters; the correspondence may be a simple numerical mapping or a complex algorithmic operation.
After obtaining the values of the extracted voice parameters, the broadcast voice parameter generation module 104 determines, through the preset policy, the values of the broadcast voice parameters corresponding to those values, thereby obtaining the broadcast voice parameters.
The preset policy may be, for example: when the input extracted voice parameters characterize a male voice, the output broadcast voice parameters characterize a female voice;
when the input extracted voice parameters characterize a child's voice, the output broadcast voice parameters characterize a deep (heavy) voice;
the speaking rate characterized by the output broadcast voice parameters is at the same level as the speaking rate characterized by the input extracted voice parameters;
the loudness characterized by the output broadcast voice parameters is at the same level as the loudness characterized by the input extracted voice parameters.
The broadcast voice parameter generation module 104 may obtain the extracted voice parameters from the voice recognition module 101 after receiving a specific trigger signal (such as an indication signal from the user to enable self-adaptive intelligent voice) or when the device is powered on.
By providing the broadcast voice parameter generation module 104 in the intelligent voice device, the above embodiment makes the voice parameters used for broadcasting take into account the voice parameters input by the user, achieving the effect of adaptively changing the broadcast voice according to differentiated user voice characteristics. Compared with the current related art, this both reduces the complexity of different users frequently configuring the voice broadcast and improves the flexibility and vividness of the voice broadcast, greatly improving the comfort of the user's human-computer interaction experience.
FIG. 2 is a flowchart of a self-adaptive intelligent voice method according to an embodiment of the present invention. As shown in FIG. 2, the method mainly includes the following steps:
S201: extract voice parameters from the sound through voice recognition. S202: generate broadcast voice parameters according to the extracted voice parameters and a preset policy. In this step, the broadcast voice parameters may be generated after a specific trigger signal is received (such as an indication signal from the user to enable self-adaptive intelligent voice) or at power-on.
The preset policy contains the correspondence between the extracted voice parameters and the broadcast voice parameters, where the input parameters are the extracted voice parameters and the output parameters are the broadcast voice parameters; the correspondence may be a simple numerical mapping or a complex algorithmic operation.
After obtaining the values of the extracted voice parameters, the values of the broadcast voice parameters corresponding to them are determined through the preset policy, thereby obtaining the broadcast voice parameters.
The preset policy may be, for example: when the input extracted voice parameters characterize a male voice, the output broadcast voice parameters characterize a female voice;
when the input extracted voice parameters characterize a child's voice, the output broadcast voice parameters characterize a deep (heavy) voice;
the speaking rate characterized by the output broadcast voice parameters is at the same level as the speaking rate characterized by the input extracted voice parameters;
the loudness characterized by the output broadcast voice parameters is at the same level as the loudness characterized by the input extracted voice parameters.
S203: generate the broadcast voice with the broadcast voice parameters.
Those of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium, such as a read-only memory, magnetic disk, or optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented with one or more integrated circuits; correspondingly, each module/unit in the above embodiments may be implemented in the form of hardware or in the form of software functional modules. The present invention is not limited to any specific combination of hardware and software.
It should be noted that the present invention may have various other embodiments; those skilled in the art may make corresponding changes and variations without departing from the spirit of the present invention, but all such changes and variations shall fall within the protection scope of the claims appended to the present invention.
Industrial Applicability
The above technical solution establishes, through a preset policy, a connection between the broadcast voice parameters and the voice parameters input by the user, avoiding the deficiency caused by using fixed data for the broadcast voice parameters without considering the user's voice characteristics. In addition, generating the broadcast voice parameters requires no manual involvement, which is convenient for users.

Claims

1. A self-adaptive intelligent voice device, comprising a voice recognition module (101), a recognition result processing module (102), and a voice broadcast module (103), wherein the device further comprises a broadcast voice parameter generation module (104),
the broadcast voice parameter generation module (104) being configured to: obtain the extracted voice parameters from the voice recognition module (101), generate broadcast voice parameters according to the voice parameters and a preset policy, and input the broadcast voice parameters to the voice broadcast module (103).
2. The self-adaptive intelligent voice device according to claim 1, wherein the broadcast voice parameter generation module (104) is further configured to obtain the voice parameters from the voice recognition module (101) after receiving a specific trigger signal or when the device is powered on.
3. The self-adaptive intelligent voice device according to claim 1 or 2, wherein
the preset policy contains the correspondence between the voice parameters and the broadcast voice parameters.
4. The self-adaptive intelligent voice device according to claim 3, wherein the broadcast voice parameter generation module (104) is configured to generate the broadcast voice parameters according to the voice parameters and the preset policy in the following manner:
obtaining the values of the voice parameters, and determining, through the preset policy, the values of the broadcast voice parameters corresponding to the values of the voice parameters.
5. A self-adaptive intelligent voice method, comprising:
after extracting voice parameters from a sound through voice recognition (S201), generating broadcast voice parameters according to the voice parameters and a preset policy (S202);
generating a broadcast voice with the broadcast voice parameters (S203).
6. The self-adaptive intelligent voice method according to claim 5, wherein the step (S202) of generating the broadcast voice parameters according to the voice parameters and the preset policy comprises:
generating the broadcast voice parameters according to the voice parameters and the preset policy after a specific trigger signal is received or at power-on.
7. The self-adaptive intelligent voice method according to claim 5 or 6, wherein
the preset policy contains the correspondence between the voice parameters and the broadcast voice parameters.
8. The self-adaptive intelligent voice method according to claim 7, wherein the step (S202) of generating the broadcast voice parameters according to the voice parameters and the preset policy comprises:
obtaining the values of the voice parameters, and determining, through the preset policy, the values of the broadcast voice parameters corresponding to the values of the voice parameters.
PCT/CN2013/077225 2012-10-12 2013-06-14 Self-adaptive intelligent voice device and method WO2013182085A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
MYPI2015000965A MY177459A (en) 2012-10-12 2013-06-14 Self-adaptive intelligent voice device and method
US14/434,934 US9552813B2 (en) 2012-10-12 2013-06-14 Self-adaptive intelligent voice device and method
EP13799812.6A EP2908312A4 (en) 2012-10-12 2013-06-14 SELF-ADAPTIVE INTELLIGENT VOICE DEVICE AND METHOD

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210385273.3 2012-10-12
CN201210385273.3A CN103730117A (zh) Self-adaptive intelligent voice device and method

Publications (1)

Publication Number Publication Date
WO2013182085A1 (zh) 2013-12-12

Family

ID=49711396

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/077225 WO2013182085A1 (zh) 2012-10-12 2013-06-14 一种自适应智能语音装置及方法

Country Status (5)

Country Link
US (1) US9552813B2 (zh)
EP (1) EP2908312A4 (zh)
CN (1) CN103730117A (zh)
MY (1) MY177459A (zh)
WO (1) WO2013182085A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657492A * (zh) 2015-03-06 2015-05-27 蔡伟英 Method and system for retrieving setting items based on speech recognition
CN105304081A * (zh) 2015-11-09 2016-02-03 Shanghai Yuzhiyi Information Technology Co., Ltd. Voice broadcast system and voice broadcast method for a smart home
CN106128478B * (zh) 2016-06-28 2019-11-08 Beijing Xiaomi Mobile Software Co., Ltd. Voice broadcast method and device
US10157607B2 2016-10-20 2018-12-18 International Business Machines Corporation Real time speech output speed adjustment
CN110085225B * (zh) 2019-04-24 2024-01-02 Beijing Baidu Netcom Science and Technology Co., Ltd. Voice interaction method and apparatus, intelligent robot, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004198456A * (ja) 2002-12-16 2004-07-15 Toyota Central Res & Dev Lab Inc Acoustic model training device
CN1662955A * (zh) 2002-04-22 2005-08-31 Matsushita Electric Industrial Co., Ltd. Pattern matching for large-vocabulary speech recognition using compressed allocation and localized format access
CN1811911A * (zh) 2005-01-28 2006-08-02 Beijing Jietong Huasheng Speech Technology Co., Ltd. Adaptive speech conversion processing method
US20060184370A1 * (en) 2005-02-15 2006-08-17 Samsung Electronics Co., Ltd. Spoken dialogue interface apparatus and method
CN102237082A * (zh) 2010-05-05 2011-11-09 Samsung Electronics Co., Ltd. Adaptation method for a speech recognition system

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3900580B2 * (ja) 1997-03-24 2007-04-04 Yamaha Corporation Karaoke apparatus
JPH10304068A * (ja) 1997-04-30 1998-11-13 Nec Corp Voice information exchange system
JP3728177B2 * (ja) 2000-05-24 2005-12-21 Canon Inc. Speech processing system, apparatus, method, and storage medium
KR20030006308A * (ko) 2001-07-12 2003-01-23 LG Electronics Inc. Voice modulation apparatus and method for a mobile communication terminal
JP2004226672A * (ja) 2003-01-22 2004-08-12 Omron Corp Music data generation system, server device, and music data generation method
JP3984207B2 * (ja) 2003-09-04 2007-10-03 Toshiba Corporation Speech recognition evaluation apparatus, speech recognition evaluation method, and speech recognition evaluation program
CN100440314C * (zh) 2004-07-06 2008-12-03 Institute of Automation, Chinese Academy of Sciences High-quality real-time voice conversion method based on speech analysis and synthesis
KR100695127B1 * (ko) 2004-10-08 2007-03-14 Samsung Electronics Co., Ltd. Multi-stage speech recognition apparatus and method
JP4241736B2 (ja) 2006-01-19 2009-03-18 Toshiba Corporation Speech processing apparatus and method
KR100809368B1 * (ko) 2006-08-09 2008-03-05 Korea Advanced Institute of Science and Technology Voice color conversion system using glottal waveforms
CN102084386A * (zh) 2008-03-24 2011-06-01 姜旻秀 Keyword advertising method using meta-information associated with digital content, and system therefor
KR20110066404A * (ko) 2009-12-11 2011-06-17 S-1 Corporation Self-calling and emergency reporting method using a mobile device, system therefor, and recording medium recording the same
KR20110083027A * (ko) 2010-01-13 2011-07-20 S-1 Corporation Self-calling and emergency reporting method through time setting of a mobile terminal, system therefor, and recording medium recording the same
CN102473416A * (zh) 2010-06-04 2012-05-23 Panasonic Corporation Voice quality conversion device and method, vowel information creation device, and voice quality conversion system
CN102004624B * (zh) 2010-11-11 2012-08-22 China United Network Communications Group Co., Ltd. Speech recognition control system and method
CN102568472A * (zh) 2010-12-15 2012-07-11 Shengle Information Technology (Shanghai) Co., Ltd. Speaker-selectable speech synthesis system and implementation method therefor
US9369307B2 * (en) 2011-07-12 2016-06-14 Bank Of America Corporation Optimized service integration
US9015320B2 * (en) 2011-07-12 2015-04-21 Bank Of America Corporation Dynamic provisioning of service requests
US8719919B2 * (en) 2011-07-12 2014-05-06 Bank Of America Corporation Service mediation framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2908312A4 *

Also Published As

Publication number Publication date
EP2908312A1 (en) 2015-08-19
MY177459A (en) 2020-09-16
EP2908312A4 (en) 2015-12-02
CN103730117A (zh) 2014-04-16
US9552813B2 (en) 2017-01-24
US20150262579A1 (en) 2015-09-17

Similar Documents

Publication Publication Date Title
JP6489563B2 (ja) Volume adjustment method, system, device, and program
KR101786533B1 (ko) Multi-level speech recognition
CN110413250B (zh) Voice interaction method, apparatus, and system
CN105847099B (zh) Internet of Things implementation system and method based on artificial intelligence
US11145311B2 (en) Information processing apparatus that transmits a speech signal to a speech recognition server triggered by an activation word other than defined activation words, speech recognition system including the information processing apparatus, and information processing method
WO2013182085A1 (zh) Self-adaptive intelligent voice device and method
JP6783339B2 (ja) Method and apparatus for processing speech
CN109637548A (zh) Voice interaction method and apparatus based on voiceprint recognition
US10290292B2 (en) Noise control method and device
CN109473095A (zh) Smart home control system and control method
JP5753212B2 (ja) Speech recognition system, server, and speech processing device
CN103366744B (zh) Method and apparatus for controlling a portable terminal based on voice
CN108962239A (zh) Rapid network configuration method and system based on speech masking
WO2020114181A1 (zh) Network speech recognition method, network service interaction method, and smart earphone
CN105376418A (zh) Incoming call information processing method, apparatus, and system
CN109686370A (zh) Method and apparatus for playing Dou Dizhu by voice control
CN107948854B (zh) Operation audio generation method, apparatus, terminal, and computer-readable medium
CN109065050A (zh) Voice control method, apparatus, device, and storage medium
CN109600470B (zh) Mobile terminal and sound emission control method therefor
US20140199981A1 (en) Mobile phone to appliance communication via audio sampling
CN104104997A (zh) Mute-start control method, apparatus, and system for a television
CN108962259B (zh) Processing method and first electronic device
CN104811792A (zh) System and method for voice-controlling a TV box via a mobile phone
WO2016074376A1 (zh) Method, apparatus, and computer-readable storage medium for implementing voice control
WO2016054885A1 (zh) Method and apparatus for processing an operation object

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13799812

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 14434934

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2013799812

Country of ref document: EP