WO2020029503A1 - Voice control device and method - Google Patents

Voice control device and method

Info

Publication number
WO2020029503A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
module
local
unit
control device
Prior art date
Application number
PCT/CN2018/121398
Other languages
English (en)
French (fr)
Inventor
王子
梁博
郑文成
Original Assignee
珠海格力电器股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 珠海格力电器股份有限公司
Publication of WO2020029503A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/32: Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223: Execution procedure of a spoken command

Definitions

  • The present application relates to a voice control device and method, and belongs to the technical field of voice control.
  • Using voice recognition technology to control smart terminals facilitates human-computer interaction and makes people's work and life more convenient.
  • Current voice control technologies mainly include local and online methods: purely local methods are functionally simple and practical, while online methods are powerful and can provide more services; each has advantages and disadvantages.
  • There are also hybrid local-and-online methods: online recognition is used when the device is networked, and the device switches to local recognition when offline; or a terminal or server scores the results of simultaneous local and online recognition, and the higher-scoring result is executed.
  • An object of the present application is to provide a voice control device and method.
  • A voice control device includes a voice module, a communication module, and a smart terminal, characterized in that data is transmitted between the voice module and the smart terminal through the communication module; the voice module is used to collect and broadcast voice and to perform local voice recognition on the voice; and a voice cloud platform is installed on the smart terminal.
  • The voice control device further includes a microphone, and the microphone is connected to the voice module.
  • The voice control device further includes a speaker, and the speaker is connected to the voice module.
  • The voice module includes a voice acquisition unit, an audio processing unit, a local voice recognition unit, a local voice data unit, and a voice broadcast unit; the voice acquisition unit is connected to the audio processing unit, the audio processing unit is connected to the local voice recognition unit and to the voice broadcast unit, the local voice recognition unit is connected to the local voice data unit, and the local voice data unit is connected to the voice broadcast unit.
  • The voice acquisition unit is connected to a microphone.
  • The voice broadcast unit is connected to a speaker.
  • A voice control method includes the following steps: after the voice module collects a sound signal, it first identifies whether the sound signal contains a local wake-up word or command word; if it does, the voice module retrieves the corresponding pre-stored voice broadcast data and broadcasts it; if it does not, the voice module sends the sound signal through the communication module to the smart terminal for processing.
  • The smart terminal processes the data and then sends it to the voice cloud platform for processing, and the voice cloud platform sends voice broadcast data through the communication module to the voice module for broadcast.
  • The communication module is Bluetooth or WiFi.
  • The voice module collects sound signals through a microphone and performs voice broadcasting through a speaker.
  • This application proposes a low-cost voice control solution that not only supports local voice recognition control but also enables online control when connected to a smart terminal (such as a mobile phone, TV, or router). It resolves the technical defect that the online method cannot accurately identify the scenario, which leads to errors in speech recognition or intent understanding.
  • FIG. 1 is a structural block diagram of a voice control device of the present application.
  • FIG. 2 is a structural block diagram of a voice module of the present application.
  • FIG. 3 is a flowchart of an embodiment of the present application.
  • The voice device is configured with a microphone, a speaker, a voice module, and a communication module.
  • The microphone and the speaker are connected to the voice module.
  • The voice module is responsible for collecting and broadcasting voice and for performing local voice recognition on the voice.
  • The communication module is responsible for connecting to the smart terminal over Bluetooth or WiFi wireless communication.
  • Audio collected by the voice module can be sent as voice data to the smart terminal through the communication module.
  • The smart terminal can also send voice broadcast data through the communication module to the voice module for broadcast.
  • The voice module includes a voice acquisition unit, an audio processing unit, a local voice recognition unit, a local voice data unit, and a voice broadcast unit.
  • The voice acquisition unit is connected to the audio processing unit, the audio processing unit is connected to the local voice recognition unit and to the voice broadcast unit, the local voice recognition unit is connected to the local voice data unit, and the local voice data unit is connected to the voice broadcast unit.
  • The microphone is connected to the voice acquisition unit, and the speaker is connected to the voice broadcast unit.
  • The voice module collects an analog sound signal through the microphone, converts the analog signal into digital audio data through the audio processing unit, and then sends the data to the smart terminal through Bluetooth or WiFi.
  • At the same time, the digital audio data is passed through the local voice recognition unit to obtain any matching local wake-up word or command word.
  • The voice module pre-stores multiple pieces of voice broadcast audio data.
  • When a local wake-up word or command word is recognized, the voice module converts the corresponding stored broadcast data into an analog signal for broadcast.
  • The audio processing unit also receives the audio data stream sent by the smart terminal, converts the data stream into an analog signal, and sends it to the voice broadcast unit, which drives the speaker to perform the voice broadcast.
  • The voice device has a local wake-up word, such as "hello voice".
  • When the user says "hello voice" and the voice device detects the local wake-up word, the device enters local voice command word recognition mode.
  • After the voice device is connected to the smart terminal through the communication module, if the user's speech contains no local wake-up word, the voice is passed directly to the smart terminal for processing.
  • The smart terminal runs the apps of various voice platforms, such as Baidu Maps and Tmall Genie. If the user then says "Tmall Genie", the corresponding app is woken up and provides that platform's online voice service.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

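The present application discloses a voice control device and method. The device includes a voice module, a communication module, and a smart terminal, wherein data is transmitted between the voice module and the smart terminal through the communication module; the voice module collects and broadcasts voice and performs local voice recognition on the voice; and a voice cloud platform is installed on the smart terminal. The present application provides a low-cost voice control solution that not only supports local voice recognition control but also enables online control when connected to a smart terminal (such as a mobile phone, TV, or router).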

Description

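Voice control device and method
Related application
This application claims priority to Chinese patent application No. 201810900340.8, filed on August 9, 2018 and entitled "A voice control device and method", which is incorporated herein by reference in its entirety.
Technical field
The present application relates to a voice control device and method, and belongs to the technical field of voice control.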
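Background
Using voice recognition technology to control smart terminals makes human-computer interaction convenient and makes people's work and life easier.
Current voice control technologies mainly take two forms, local and online: a purely local approach is functionally simple and highly practical, while an online approach is powerful and can provide more services; each has advantages and disadvantages. There are also hybrid local-and-online approaches: online recognition is used when the device is networked, and the device switches to local recognition when offline; or the terminal or a server scores the results of simultaneous local and online recognition, and the higher-scoring result is executed.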
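The two prior-art hybrid strategies described above can be illustrated with a small sketch. This is not taken from the application: the `Recognition` type, the `score` field, and both function names are hypothetical, and real systems would compute scores from their recognizers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recognition:
    source: str   # "local" or "online"
    text: str     # recognized text
    score: float  # hypothetical confidence score in [0, 1]


def pick_by_connectivity(local: Recognition,
                         online: Optional[Recognition]) -> Recognition:
    """Strategy 1: use the online result when networked, else fall back to local."""
    return online if online is not None else local


def pick_by_score(local: Recognition,
                  online: Optional[Recognition]) -> Recognition:
    """Strategy 2: both recognizers run; the higher-scoring result is executed."""
    if online is None:
        return local
    return online if online.score > local.score else local
```

In the second strategy a wrong but high-scoring result still wins, which is the kind of mis-routing risk the background section goes on to describe.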
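However, whichever of the above approaches is used, there is a risk that, because too many functions and scenarios are supported, the recognition result or the intent understanding is wrong and the system enters a different scenario service. In addition, hybrid offline-online solutions are costly to implement.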
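Summary of the invention
To solve the above problems, an object of the present application is to provide a voice control device and method.
The technical solution adopted by the device of the present application is as follows:
A voice control device includes a voice module, a communication module, and a smart terminal, characterized in that data is transmitted between the voice module and the smart terminal through the communication module; the voice module is configured to collect and broadcast voice and to perform local voice recognition on the voice; and a voice cloud platform is installed on the smart terminal.
In one embodiment, the voice control device further includes a microphone, and the microphone is connected to the voice module.
In one embodiment, the voice control device further includes a speaker, and the speaker is connected to the voice module.
In one embodiment, the voice module includes a voice acquisition unit, an audio processing unit, a local voice recognition unit, a local voice data unit, and a voice broadcast unit; the voice acquisition unit is connected to the audio processing unit, the audio processing unit is connected to the local voice recognition unit and to the voice broadcast unit, the local voice recognition unit is connected to the local voice data unit, and the local voice data unit is connected to the voice broadcast unit.
In one embodiment, the voice acquisition unit is connected to a microphone.
In one embodiment, the voice broadcast unit is connected to a speaker.
The technical solution adopted by the method of the present application is as follows:
A voice control method includes the following steps: after the voice module collects a sound signal, it first identifies whether the sound signal contains a local wake-up word or command word; if it does, the voice module retrieves the corresponding pre-stored voice broadcast data and broadcasts it; if it does not, the voice module sends the sound signal through the communication module to the smart terminal for processing.
In one embodiment, the smart terminal processes the data and then sends it to the voice cloud platform for processing, and the voice cloud platform sends voice broadcast data through the communication module to the voice module for broadcast.
In one embodiment, the communication module is Bluetooth or WiFi.
In one embodiment, the voice module collects sound signals through a microphone and performs voice broadcasting through a speaker.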
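The method steps above can be sketched as a simple routing function. This is a minimal illustration under stated assumptions, not the applicant's implementation: the sound signal is stood in for by a text transcript, and `LOCAL_PROMPTS`, `recognize_local`, and `handle_sound` are hypothetical names. A real voice module would match keywords on audio rather than on text.

```python
from typing import Callable, Dict, Optional

# Hypothetical table: locally recognized word -> pre-stored prompt audio.
LOCAL_PROMPTS: Dict[str, bytes] = {
    "hello voice": b"<prompt: listening>",
    "power on": b"<prompt: powering on>",
    "power off": b"<prompt: powering off>",
}


def recognize_local(transcript: str) -> Optional[str]:
    """Stand-in for the local recognizer: first matching local word, or None."""
    text = transcript.lower()
    for word in LOCAL_PROMPTS:
        if word in text:
            return word
    return None


def handle_sound(transcript: str,
                 play: Callable[[bytes], None],
                 send_to_terminal: Callable[[str], None]) -> str:
    """Play the stored prompt for a local word; otherwise forward the signal."""
    word = recognize_local(transcript)
    if word is not None:
        play(LOCAL_PROMPTS[word])      # broadcast pre-stored voice data
        return "local"
    send_to_terminal(transcript)       # via the communication module
    return "forwarded"
```

Passing `play` and `send_to_terminal` as callables keeps the sketch independent of any particular speaker driver or Bluetooth/WiFi stack.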
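The present application proposes a low-cost voice control solution that not only supports local voice recognition control but also enables online control when connected to a smart terminal (such as a mobile phone, TV, or router). It resolves the technical defect that the online approach cannot accurately identify the scenario, which leads to errors in speech recognition or intent understanding.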
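Brief description of the drawings
FIG. 1 is a structural block diagram of the voice control device of the present application.
FIG. 2 is a structural block diagram of the voice module of the present application.
FIG. 3 is a flowchart of an embodiment of the present application.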
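Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the implementation of the present application is described in further detail below with reference to the accompanying drawings.
This embodiment uses local voice recognition technology and wireless audio transmission technology, with a smart terminal such as a mobile phone serving as the computing platform, to realize a low-cost solution for hybrid local and online voice recognition. As shown in FIG. 1, the voice device is configured with a microphone, a speaker, a voice module, and a communication module. The microphone and the speaker are connected to the voice module, which is responsible for collecting and broadcasting voice and for performing local voice recognition on the voice. The communication module is responsible for connecting to the smart terminal over Bluetooth or WiFi wireless communication; audio collected by the voice module can be sent as voice data to the smart terminal through the communication module, and the smart terminal can likewise send voice broadcast data through the communication module to the voice module for broadcast.
As shown in FIG. 2, the voice module includes a voice acquisition unit, an audio processing unit, a local voice recognition unit, a local voice data unit, and a voice broadcast unit; the voice acquisition unit is connected to the audio processing unit, the audio processing unit is connected to the local voice recognition unit and to the voice broadcast unit, the local voice recognition unit is connected to the local voice data unit, and the local voice data unit is connected to the voice broadcast unit. The microphone is connected to the voice acquisition unit, and the speaker is connected to the voice broadcast unit.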
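The unit connections of FIG. 2 can be written down as a small directed graph to make the two routes from microphone to speaker explicit. The text only says the units are "connected"; the edge directions below, the identifier names, and the `paths` helper are assumptions made for illustration.

```python
from typing import Dict, List

# Assumed signal-flow directions between the units named in the text.
CONNECTIONS: Dict[str, List[str]] = {
    "microphone": ["voice_acquisition"],
    "voice_acquisition": ["audio_processing"],
    "audio_processing": ["local_recognition", "voice_broadcast"],
    "local_recognition": ["local_voice_data"],
    "local_voice_data": ["voice_broadcast"],
    "voice_broadcast": ["speaker"],
    "speaker": [],
}


def paths(src: str, dst: str,
          graph: Dict[str, List[str]] = CONNECTIONS) -> List[List[str]]:
    """Return every path from src to dst (the graph is acyclic) by depth-first search."""
    if src == dst:
        return [[src]]
    found: List[List[str]] = []
    for nxt in graph[src]:
        for tail in paths(nxt, dst, graph):
            found.append([src] + tail)
    return found


if __name__ == "__main__":
    for route in paths("microphone", "speaker"):
        print(" -> ".join(route))
```

Under these assumed directions there are two routes: one through the local voice recognition and local voice data units (stored prompts), and one directly from the audio processing unit to the voice broadcast unit (audio streamed from the smart terminal).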
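Specifically, the voice module collects an analog sound signal through the microphone, converts the analog signal into digital audio data through the audio processing unit, and then sends the data to the smart terminal via Bluetooth, WiFi, or the like; at the same time, the digital audio data is passed through the local voice recognition unit to obtain any matching local wake-up word or command word. For the wake-up words and command words that the voice module recognizes locally, the voice module pre-stores multiple pieces of voice broadcast audio data; when a local wake-up word or command word is recognized, the voice module retrieves the corresponding stored broadcast data and converts it into an analog signal for broadcast. The audio processing unit also receives the audio data stream sent by the smart terminal, converts the data stream into an analog signal, and sends it to the voice broadcast unit, which drives the speaker to perform the voice broadcast.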
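The analog-to-digital and digital-to-analog conversions performed by the audio processing unit can be illustrated in software. The application does not state a sample format; the 16-bit little-endian PCM encoding, the 16 kHz sample rate, and the function names below are assumptions chosen only to make the round trip concrete.

```python
import math
import struct
from typing import List, Sequence


def to_pcm16(samples: Sequence[float]) -> bytes:
    """Quantize floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def from_pcm16(data: bytes) -> List[float]:
    """Convert 16-bit PCM bytes back to floats (the playback direction)."""
    count = len(data) // 2
    return [v / 32767 for v in struct.unpack("<%dh" % count, data)]


# 10 ms of a 440 Hz tone at an assumed 16 kHz sample rate.
RATE = 16000
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE // 100)]
pcm = to_pcm16(tone)         # what would be sent to the terminal or recognizer
restored = from_pcm16(pcm)   # what would be handed to the speaker path
```

The same byte stream can feed both the wireless link and the local recognizer, which matches the parallel handling described in the paragraph above.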
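As shown in FIG. 3, the voice device has a local wake-up word, for example "hello voice" ("你好语音"). When the user says "hello voice" and the voice device detects it as the local wake-up word, the device enters local voice command word recognition mode; if the user then speaks a local command word, such as "power on" ("开机") or "power off" ("关机"), the corresponding command action is executed. After the voice device is connected to the smart terminal through the communication module, if the user's speech contains no local wake-up word, the voice is passed directly to the smart terminal for processing. The smart terminal runs the apps of various voice platforms, such as Baidu Maps (百度地图) and Tmall Genie (天猫精灵); if the user then says "Tmall Genie", the corresponding app is woken up and provides that platform's online voice service.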
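The flow of FIG. 3 can be sketched as a two-state machine. The class and names below are hypothetical, utterances are represented as text, and two behaviors the application leaves open are filled in as labeled assumptions: the device returns to idle after one command, and non-command speech heard in command mode is forwarded to the terminal.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    IDLE = "idle"
    LOCAL_COMMAND = "local_command"


WAKE_WORD = "hello voice"
LOCAL_COMMANDS = {"power on", "power off"}


class VoiceDevice:
    def __init__(self, terminal_connected: bool = True) -> None:
        self.mode = Mode.IDLE
        self.terminal_connected = terminal_connected
        self.log: List[str] = []

    def hear(self, utterance: str) -> str:
        text = utterance.strip().lower()
        if text == WAKE_WORD:
            # Local wake-up word: switch to local command recognition.
            self.mode = Mode.LOCAL_COMMAND
            action = "enter local command mode"
        elif self.mode is Mode.LOCAL_COMMAND and text in LOCAL_COMMANDS:
            # Assumption: one command per wake-up, then back to idle.
            self.mode = Mode.IDLE
            action = "execute: " + text
        elif self.terminal_connected:
            # No local wake-up word or command: hand over to the smart terminal.
            action = "forward to smart terminal: " + text
        else:
            action = "ignored"
        self.log.append(action)
        return action
```

With this sketch, saying "hello voice" and then "power on" is handled entirely on the device, while "Tmall Genie" is forwarded so the corresponding app on the terminal can wake up.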

Claims (10)

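  1. A voice control device, comprising a voice module, a communication module, and a smart terminal, characterized in that data is transmitted between the voice module and the smart terminal through the communication module; the voice module is configured to collect and broadcast voice and to perform local voice recognition on the voice; and a voice cloud platform is installed on the smart terminal.
  2. The voice control device according to claim 1, characterized in that the voice control device further comprises a microphone, and the microphone is connected to the voice module.
  3. The voice control device according to claim 1 or 2, characterized in that the voice control device further comprises a speaker, and the speaker is connected to the voice module.
  4. The voice control device according to claim 1, characterized in that the voice module comprises a voice acquisition unit, an audio processing unit, a local voice recognition unit, a local voice data unit, and a voice broadcast unit; the voice acquisition unit is connected to the audio processing unit, the audio processing unit is connected to the local voice recognition unit and to the voice broadcast unit, the local voice recognition unit is connected to the local voice data unit, and the local voice data unit is connected to the voice broadcast unit.
  5. The voice control device according to claim 4, characterized in that the voice acquisition unit is connected to a microphone.
  6. The voice control device according to claim 4 or 5, characterized in that the voice broadcast unit is connected to a speaker.
  7. A voice control method, characterized by comprising the following steps:
    after a voice module collects a sound signal, first identifying whether the sound signal contains a local wake-up word or command word; if the sound signal contains the local wake-up word or command word, the voice module retrieves the corresponding pre-stored voice broadcast data and broadcasts it; if the sound signal does not contain the local wake-up word or command word, the voice module sends the sound signal through a communication module to a smart terminal for processing.
  8. The voice control method according to claim 7, characterized in that the smart terminal processes the data and then sends it to a voice cloud platform for processing, and the voice cloud platform sends voice broadcast data through the communication module to the voice module for broadcast.
  9. The voice control method according to claim 7, characterized in that the communication module is Bluetooth or WiFi.
  10. The voice control method according to any one of claims 7 to 9, characterized in that the voice module collects sound signals through a microphone and performs voice broadcasting through a speaker.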
PCT/CN2018/121398 2018-08-09 2018-12-17 Voice control device and method WO2020029503A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810900340.8 2018-08-09
CN201810900340.8A CN108877799A (zh) 2018-08-09 2018-08-09 A voice control device and method

Publications (1)

Publication Number Publication Date
WO2020029503A1 (zh) 2020-02-13

Family

ID=64317641

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/121398 WO2020029503A1 (zh) 2018-08-09 2018-12-17 Voice control device and method

Country Status (2)

Country Link
CN (1) CN108877799A (zh)
WO (1) WO2020029503A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
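CN108877799A (zh) * 2018-08-09 2018-11-23 珠海格力电器股份有限公司 A voice control device and method
CN111292716A (zh) * 2020-02-13 2020-06-16 百度在线网络技术(北京)有限公司 Voice chip and electronic device
CN111726807A (zh) * 2020-04-22 2020-09-29 深圳市伟文无线通讯技术有限公司 Device and method for bringing an embedded WiFi module onto a network through voice interaction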

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
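CN106098062A (zh) * 2016-06-16 2016-11-09 杭州古北电子科技有限公司 Intelligent voice recognition control system and method combining local processing with a wireless network
CN106448664A (zh) * 2016-10-28 2017-02-22 魏朝正 A system and method for controlling smart home devices by voice
CN106452997A (zh) * 2016-09-30 2017-02-22 无锡小天鹅股份有限公司 Household appliance and control system thereof
CN107146617A (zh) * 2017-06-15 2017-09-08 成都启英泰伦科技有限公司 A novel voice recognition device and method
CN107274902A (zh) * 2017-08-15 2017-10-20 深圳诺欧博智能科技有限公司 Voice control device and method for household appliances
CN108877799A (zh) * 2018-08-09 2018-11-23 珠海格力电器股份有限公司 A voice control device and method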

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9009025B1 (en) * 2011-12-27 2015-04-14 Amazon Technologies, Inc. Context-based utterance recognition
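CN107369445A (zh) * 2016-05-11 2017-11-21 上海禹昌信息科技有限公司 Method for simultaneously supporting voice wake-up and voice control of a smart terminal
CN107424607B (zh) * 2017-07-04 2023-06-06 珠海格力电器股份有限公司 Voice control mode switching method and apparatus, and device having the apparatus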

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
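CN106098062A (zh) * 2016-06-16 2016-11-09 杭州古北电子科技有限公司 Intelligent voice recognition control system and method combining local processing with a wireless network
CN106452997A (zh) * 2016-09-30 2017-02-22 无锡小天鹅股份有限公司 Household appliance and control system thereof
CN106448664A (zh) * 2016-10-28 2017-02-22 魏朝正 A system and method for controlling smart home devices by voice
CN107146617A (zh) * 2017-06-15 2017-09-08 成都启英泰伦科技有限公司 A novel voice recognition device and method
CN107274902A (zh) * 2017-08-15 2017-10-20 深圳诺欧博智能科技有限公司 Voice control device and method for household appliances
CN108877799A (zh) * 2018-08-09 2018-11-23 珠海格力电器股份有限公司 A voice control device and method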

Also Published As

Publication number Publication date
CN108877799A (zh) 2018-11-23

Similar Documents

Publication Publication Date Title
US9978369B2 (en) Method and apparatus for voice control of a mobile device
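WO2020029503A1 (zh) Voice control device and method
CN203721183U (zh) A voice wake-up device
TWI489372B (zh) Voice control method and mobile terminal device
WO2015009086A1 (en) Multi-level speech recognition
CN107134286A (zh) Wireless audio playback method based on voice interaction, music player, and storage medium
CN109348051A (zh) Method, apparatus, device, and medium for automatically answering mobile phone calls
CN206819732U (zh) Smart music player
CN105677290B (zh) Control method and client for a voice application
CN103577144A (zh) Voice input method for in-vehicle equipment and voice input system thereof
CN205901877U (zh) An in-vehicle Android phone connection device
CN108900270A (zh) Method for a train broadcast system and digital train broadcast system
CN103745720A (zh) A Bluetooth system with voice recognition
CN103634448A (zh) An intelligent voice reply method for incoming calls
CN105915248B (zh) In-vehicle terminal of an intelligent car-hailing system
CN111524512A (zh) Method for starting a one-shot voice dialogue with low latency, peripheral device, and voice interaction apparatus with low-latency response
CN203349836U (zh) In-vehicle one-touch voice navigation terminal
US9137645B2 (en) Apparatus and method for dynamic call based user ID
CN208971527U (zh) A digital train broadcast system
CN106899617B (zh) A vehicle-to-vehicle chat system based on V2X technology
CN104158566A (zh) Connection control method and apparatus for an in-vehicle communication mechanism and a wireless headset
CN110400568A (zh) Wake-up method for an intelligent voice system, intelligent voice system, and vehicle
CN106657539A (zh) An in-vehicle multifunctional operation-free intelligent service device
CN106528789A (zh) A robot-based intelligent service system
CN110351690B (zh) An intelligent voice system and voice processing method thereof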

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18929662

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18929662

Country of ref document: EP

Kind code of ref document: A1