WO2021008095A1 - Offline far-field voice control system, control method and device - Google Patents

Offline far-field voice control system, control method and device

Info

Publication number
WO2021008095A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
offline
far
user
data
Prior art date
Application number
PCT/CN2019/130505
Other languages
English (en)
French (fr)
Inventor
郭志俊
熊跃平
徐立
Original Assignee
深圳创维-Rgb电子有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-07-16
Filing date
2019-12-31
Publication date
2021-01-21
Application filed by 深圳创维-Rgb电子有限公司

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42222 Additional components integrated in the remote control device, e.g. timer, speaker, sensors for detecting position, direction or movement of the remote control, microphone or battery charging device
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • This application relates to the technical field of electronic equipment, and in particular to an offline far-field voice control system, control method and equipment.
  • The main purpose of this application is to provide an offline far-field voice control system, control method and device, aiming to solve the technical problem in the prior art that, without a network connection, the TV has to be controlled with a remote control, which is inconvenient for users.
  • To achieve this, the offline far-field voice control system includes a voice collection module, a voice recognition module, a compilation and compression module, and a control chip; wherein,
  • the voice collection module is configured to collect user voice to obtain corresponding voice data
  • the voice recognition module is configured to determine that the voice data exists in a preset voice library
  • the compiling and compressing module is configured to compile and compress the voice data to obtain operation instructions
  • the control chip is configured to control the TV to implement corresponding operations according to the operation instructions.
  • the voice collection module further includes a microphone, a first collection unit, a second collection unit, and a filtering unit;
  • the first collection unit is connected to the microphone and is configured to collect voice through the microphone to obtain voice collection data, and send the voice collection data to the filtering unit;
  • the second collection unit is connected to the power amplifier of the TV, and is configured to collect the power amplifier voice data and send the power amplifier voice data to the filtering unit;
  • the filtering unit is connected to the voice recognition module and is configured to filter out the power amplifier voice data in the voice collection data, so as to use the filtered voice collection data as the voice data corresponding to the user's voice.
  • Optionally, the system further includes a trigger module connected to the voice collection module and configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module to collect the user's voice.
  • Optionally, the system further includes a display connected to the control chip and configured to display, on the TV, an interface corresponding to the operation.
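  • This application also proposes an offline far-field voice control method, which includes the following steps:
  • collecting user voice to obtain corresponding voice data;
  • determining that the voice data exists in a preset voice library;
  • compiling and compressing the voice data to obtain operation instructions; and
  • controlling the TV to implement the corresponding operation according to the operation instructions.
  • Optionally, the step of controlling the TV to implement the corresponding operation according to the operation instructions includes:
  • matching the operation instruction against a preset control instruction library; and
  • upon determining that the match succeeds, implementing the corresponding operation according to the control instruction in the preset control instruction library that matches the operation instruction.
  • Optionally, the step of matching the operation instruction against the preset control instruction library includes:
  • performing string similarity matching between the operation instruction and the control instructions in the preset control instruction library; and
  • determining that the match succeeds when the string similarity between the operation instruction and a control instruction falls within a first preset range.
  • Optionally, the step of collecting user voice to obtain corresponding voice data includes:
  • detecting sounds made by the user in real time and, when it is determined that the sound data corresponding to those sounds contains a preset activation instruction, collecting the user's voice to obtain the corresponding voice data.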
  • This application also proposes an offline far-field voice control device, which includes the offline far-field voice control system described above, or applies the offline far-field voice control method described above.
  • In this application, the voice collection module collects the user's voice to obtain corresponding voice data; the voice recognition module determines that the voice data exists in the preset voice library; the compilation and compression module compiles and compresses the voice data to obtain operation instructions; and the control chip controls the TV to implement the corresponding operation according to the operation instructions.
  • Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.
  • Fig. 1 is a functional module diagram of an embodiment of an offline far-field voice system of the present application
  • FIG. 2 is a schematic structural diagram of an embodiment of an offline far-field voice system of the present application.
  • Fig. 3 is a flowchart of an embodiment of an offline far-field voice method according to the present application.
  • Description of reference numerals: 100 voice collection module; 110 microphone; 120 first collection unit; 130 second collection unit; 140 filtering unit; 200 voice recognition module; 300 compilation and compression module; 400 control chip; 500 trigger module; 600 display.
  • This application provides an offline far-field voice control system.
  • Referring to FIG. 1, in one embodiment the offline far-field voice control system includes a voice collection module 100, a voice recognition module 200, a compilation and compression module 300, and a control chip 400. The voice collection module 100 is configured to collect user voice and obtain corresponding voice data; the voice recognition module 200 is configured to determine that the voice data exists in a preset voice library; the compilation and compression module 300 is configured to compile and compress the voice data to obtain operation instructions; and the control chip 400 is configured to control the TV to implement the corresponding operations according to the operation instructions.
  • It should be understood that a traditional TV with no operating system and no network function can only be controlled by remote control. This embodiment realizes voice control of the TV without an Internet connection, which frees TV users from long-term reliance on the remote control and enhances the user experience and the sense of technology.
  • It should be noted that the voice collection module 100 is connected to the voice recognition module 200, the voice recognition module 200 is connected to the compilation and compression module 300, and the compilation and compression module 300 is connected through a serial port to the control chip 400 of the TV, to which it transmits the operation instructions. After receiving an operation instruction, the TV performs the corresponding operation, so that the user can control the TV entirely by voice.
  • In implementation, the offline far-field voice control system is powered by the TV's internal +5V_Standby supply. While the TV is connected to power, the +5V_Standby output is stable regardless of whether the TV has been turned on. Therefore, even when the TV is in standby, the offline far-field voice control system operates normally and can respond to the user's voice control.
  • When a traditional TV with no operating system is not connected to a network, the presence of the voice data in the preset voice library is determined by database comparison. The compilation and compression module then compiles and compresses the voice data to obtain operation instructions and transmits them to the control module of the TV through the serial port (RX/TX).
  • The control module automatically parses and compares the operation instructions so that the TV takes the corresponding action, such as power on, channel up or mute, thereby achieving far-field voice control by the user.
  • The preset voice library can be set up as shown in the command table in the Description section.
  • In this embodiment, the voice collection module collects the user's voice to obtain corresponding voice data; the voice recognition module determines that the voice data exists in the preset voice library; the compilation and compression module compiles and compresses the voice data to obtain operation instructions; and the control chip controls the TV to implement the corresponding operation according to the operation instructions.
  • Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.
  • FIG. 2 is a schematic structural diagram of an embodiment of an offline far-field voice system.
  • In this embodiment, the voice collection module 100 further includes a microphone 110, a first collection unit 120, a second collection unit 130, and a filtering unit 140.
  • The first collection unit 120 is connected to the microphone 110 and is configured to collect voice through the microphone 110 to obtain voice collection data, and to send the voice collection data to the filtering unit 140.
  • The second collection unit 130 is connected to the power amplifier of the TV and is configured to capture the power amplifier voice data by loopback and send it to the filtering unit 140.
  • The filtering unit 140 is connected to the voice recognition module 200 and is configured to filter the power amplifier voice data out of the voice collection data, so that the filtered voice collection data serves as the voice data corresponding to the user's voice.
  • It should be noted that the voice collection data includes both the user's voice and the power amplifier voice data. Because the ambient noise is complex while the TV is playing, a loopback capture function is provided in order to improve recognition accuracy for voice control instructions and to filter out the sound emitted by the TV's power amplifier. It feeds the TV's own audio signal back to the voice collection module 100, so that the sound emitted by the power amplifier can be identified and the user's voice can be recognized accurately.
  • It is easy to understand that, since the user's voice is collected through the microphone 110, multiple microphones can be used to collect user voice from different directions in order to further improve the user experience, the voice signal processing quality, the recognition rate in a real environment, and the accuracy of user voice recognition. In this embodiment, there are optionally at least two microphones 110.
  • Further, the offline far-field voice control system includes a trigger module 500, which is connected to the voice collection module 100 and is configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module 100 to collect the user's voice.
  • It should be understood that, to prevent misoperation, a preset activation instruction library is stored in the trigger module 500. Only when the sound data corresponding to the user's sound contains a preset activation password is the voice collection module 100 activated to collect the user's voice continuously for a preset time. If no user voice is collected within the preset time, the offline far-field voice control system enters the dormant state and can only be activated again after receiving the activation password.
  • the offline far-field voice control system further includes a display 600 connected to the control chip 400 and configured to display an interface corresponding to the operation on a television.
  • the display 600 is a display of a TV set and is configured to display an interface corresponding to the operation. For example, when the collected user voice is "open the menu", the menu interface will be displayed on the display.
  • Through the design of the voice collection module, this embodiment captures the TV's own power amplifier sound by loopback and filters it out, and collects user voice from multiple directions through multiple microphones, which realizes the collection of the real user voice and improves recognition of the user's voice.
  • FIG. 3 is a flowchart of an embodiment of an offline far-field voice method based on an offline far-field voice system.
  • the offline far-field voice method includes the following steps:
  • S10 Collect user voice to obtain corresponding voice data
  • It is easy to understand that this embodiment uses multiple microphones to collect user voice from different directions, which further improves the user experience, the voice signal processing quality, the recognition rate in a real environment, and the accuracy of user voice recognition.
  • In implementation, before the user's voice is collected, sounds made by the user are detected in real time, and the user's voice is collected only when it is determined that the sound data corresponding to those sounds contains a preset activation instruction.
  • It should be understood that, to prevent misoperation, the trigger module stores a preset activation instruction library. Only when the sound data corresponding to the user's sound contains a preset activation password is the voice collection module activated to collect the user's voice continuously for a preset time. If no user voice is collected within the preset time, the offline far-field voice control system enters the dormant state and can only be activated again after receiving the activation password.
  • S20 Determine that the voice data exists in the preset voice library
  • The preset voice library can be set up as shown in the command table in the Description section.
  • S30 Compile and compress the voice data to obtain operation instructions
  • When the voice data exists in the preset voice library, the compilation and compression module compiles and compresses the voice data, obtains the operation instruction, and transmits it to the control module of the TV through the serial port (RX/TX).
  • S40 Control the TV to implement the corresponding operation according to the operation instructions
  • In implementation, the control module matches the operation instruction against the preset control instruction library and, when the match succeeds, implements the corresponding operation according to the matching control instruction in the preset control instruction library, for example power on, channel up or mute, thereby achieving far-field voice control by the user.
  • In this embodiment, the control instruction corresponding to the operation instruction can be obtained by string similarity matching.
  • In this embodiment, user voice is collected to obtain corresponding voice data; the voice data is determined to exist in the preset voice library; the voice data is compiled and compressed to obtain operation instructions; and the TV is controlled to implement the corresponding operation according to the operation instructions.
  • Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.
  • this application also proposes an offline far-field voice control device, which includes the offline far-field voice control system as described above, or applies the above-mentioned offline far-field voice control method. It is easy to understand that the offline far-field voice control device has at least the effects brought about by the above-mentioned embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Selective Calling Equipment (AREA)

Abstract

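An offline far-field voice system, control method and device. A voice collection module (100) collects user voice to obtain corresponding voice data (S10); a voice recognition module (200) determines that the voice data exists in a preset voice library (S20); a compilation and compression module (300) compiles and compresses the voice data to obtain an operation instruction (S30); and a control chip (400) controls the TV to implement the corresponding operation according to the operation instruction (S40).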

Description

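Offline far-field voice control system, control method and device
This application claims priority to the Chinese patent application filed with the China Patent Office on July 16, 2019, with application number CN201910644412.1 and the title "Offline far-field voice control system, control method and device", the entire contents of which are incorporated into this application by reference.
Technical Field
This application relates to the technical field of electronic equipment, and in particular to an offline far-field voice control system, control method and device.
Background
With the development and spread of digital television technology, the TV has become the entertainment center of the home, and TV users have increasingly high requirements for intelligent and personalized TVs.
At present, for a traditional TV with no operating system and no network function, the user usually controls the TV with a remote control to make it perform the corresponding action, such as power on, channel up or mute. When the user needs to control the TV, the user must first find the remote control and then issue the control instruction through its buttons, which is very inconvenient.
The above content is only intended to assist in understanding the technical solution of this application and does not constitute an admission that it is prior art.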
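Technical Solution
The main purpose of this application is to provide an offline far-field voice control system, control method and device, aiming to solve the technical problem in the prior art that, without a network connection, the TV has to be controlled with a remote control, which is inconvenient for users.
To achieve the above purpose, this application provides an offline far-field voice control system, which includes a voice collection module, a voice recognition module, a compilation and compression module, and a control chip; wherein,
the voice collection module is configured to collect user voice to obtain corresponding voice data;
the voice recognition module is configured to determine that the voice data exists in a preset voice library;
the compilation and compression module is configured to compile and compress the voice data to obtain operation instructions;
the control chip is configured to control the TV to implement the corresponding operation according to the operation instructions.
Optionally, the voice collection module further includes a microphone, a first collection unit, a second collection unit, and a filtering unit;
the first collection unit is connected to the microphone and is configured to collect voice through the microphone to obtain voice collection data, and to send the voice collection data to the filtering unit;
the second collection unit is connected to the power amplifier of the TV and is configured to capture the power amplifier voice data by loopback and send the power amplifier voice data to the filtering unit;
the filtering unit is connected to the voice recognition module and is configured to filter the power amplifier voice data out of the voice collection data, so that the filtered voice collection data serves as the voice data corresponding to the user's voice.
Optionally, the system further includes a trigger module connected to the voice collection module and configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module to collect the user's voice.
Optionally, there are at least two microphones.
Optionally, the system further includes a display connected to the control chip and configured to display, on the TV, an interface corresponding to the operation.
This application also proposes an offline far-field voice control method, which includes the following steps:
collecting user voice to obtain corresponding voice data;
determining that the voice data exists in a preset voice library;
compiling and compressing the voice data to obtain operation instructions; and
controlling the TV to implement the corresponding operation according to the operation instructions.
Optionally, the step of controlling the TV to implement the corresponding operation according to the operation instructions includes:
matching the operation instruction against a preset control instruction library; and
upon determining that the match succeeds, implementing the corresponding operation according to the control instruction in the preset control instruction library that matches the operation instruction.
Optionally, the step of matching the operation instruction against the preset control instruction library includes:
performing string similarity matching between the operation instruction and the control instructions in the preset control instruction library; and
determining that the match succeeds when the string similarity between the operation instruction and a control instruction falls within a first preset range.
Optionally, the step of collecting user voice to obtain corresponding voice data includes:
detecting sounds made by the user in real time and, when it is determined that the sound data corresponding to those sounds contains a preset activation instruction, collecting the user's voice to obtain the corresponding voice data.
This application also proposes an offline far-field voice control device, which includes the offline far-field voice control system described above, or applies the offline far-field voice control method described above.
In this application, the voice collection module collects the user's voice to obtain corresponding voice data; the voice recognition module determines that the voice data exists in the preset voice library; the compilation and compression module compiles and compresses the voice data to obtain operation instructions; and the control chip controls the TV to implement the corresponding operation according to the operation instructions. Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.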
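Description of Drawings
FIG. 1 is a functional module diagram of an embodiment of an offline far-field voice system of this application;
FIG. 2 is a schematic structural diagram of an embodiment of an offline far-field voice system of this application;
FIG. 3 is a flowchart of an embodiment of an offline far-field voice method of this application.
Description of reference numerals:
Label Name Label Name
100 Voice collection module 600 Display
200 Voice recognition module 110 Microphone
300 Compilation and compression module 120 First collection unit
400 Control chip 130 Second collection unit
500 Trigger module 140 Filtering unit
The realization of the purpose, the functional characteristics and the advantages of this application are further described below with reference to the embodiments and the accompanying drawings.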
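Embodiments of the Invention
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
It should be noted that any directional indications in the embodiments of this application (such as up, down, left, right, front, back, and so on) are only used to explain the relative positions and movements of the components in a particular posture (as shown in the drawings). If that posture changes, the directional indication changes accordingly.
In addition, descriptions involving "first", "second" and the like in the embodiments of this application are for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features indicated. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. The technical solutions of the various embodiments may also be combined with each other, provided that a person of ordinary skill in the art can implement the combination. Where a combination is contradictory or cannot be implemented, it should be considered not to exist and falls outside the scope of protection claimed by this application.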
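This application provides an offline far-field voice control system.
Referring to FIG. 1, in one embodiment the offline far-field voice control system includes a voice collection module 100, a voice recognition module 200, a compilation and compression module 300, and a control chip 400. The voice collection module 100 is configured to collect user voice to obtain corresponding voice data; the voice recognition module 200 is configured to determine that the voice data exists in a preset voice library; the compilation and compression module 300 is configured to compile and compress the voice data to obtain operation instructions; and the control chip 400 is configured to control the TV to implement the corresponding operation according to the operation instructions.
It should be understood that a traditional TV with no operating system and no network function can only be controlled by remote control. This embodiment realizes voice control of the TV without an Internet connection, which frees TV users from long-term reliance on the remote control and enhances the user experience and the sense of technology.
It should be noted that the voice collection module 100 is connected to the voice recognition module 200, the voice recognition module 200 is connected to the compilation and compression module 300, and the compilation and compression module 300 is connected through a serial port to the control chip 400 of the TV, to which it transmits the operation instructions. After receiving an operation instruction, the TV performs the corresponding operation, so that the user can control the TV entirely by voice.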
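The module chain described above (100 to 200 to 300 to 400) can be sketched as follows. This is only an illustration under stated assumptions: the voice data is treated as already-transcribed text, the opcodes and the class name `OfflineVoiceController` are hypothetical, and the serial link is replaced by a plain callback. The application does not specify any of these details.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical subset of the preset voice library, mapped to made-up opcodes.
PRESET_VOICE_LIBRARY = {
    "TV power on": 0x01,
    "TV power off": 0x02,
    "channel up": 0x03,
    "mute": 0x07,
}


@dataclass
class OfflineVoiceController:
    """Chains recognition (200), compile/compress (300) and hand-off to the control chip (400)."""

    send_to_control_chip: Callable[[bytes], None]  # stands in for the serial link

    def recognize(self, voice_data: str) -> Optional[str]:
        # Module 200: accept the phrase only if it exists in the preset library.
        phrase = voice_data.strip()
        return phrase if phrase in PRESET_VOICE_LIBRARY else None

    def compile_and_compress(self, phrase: str) -> bytes:
        # Module 300: reduce the phrase to a compact operation instruction.
        return bytes([PRESET_VOICE_LIBRARY[phrase]])

    def handle(self, voice_data: str) -> bool:
        # Returns True when an operation instruction was sent to the control chip.
        phrase = self.recognize(voice_data)
        if phrase is None:
            return False
        self.send_to_control_chip(self.compile_and_compress(phrase))
        return True


sent = []
controller = OfflineVoiceController(send_to_control_chip=sent.append)
controller.handle("mute")         # in the library: one instruction is sent
controller.handle("order pizza")  # not in the library: ignored
```

Because everything is resolved against a local table, the sketch needs no network access, which is the property the embodiment relies on.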
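In implementation, the offline far-field voice control system is powered by the TV's internal +5V_Standby supply. While the TV is connected to power, the +5V_Standby output is stable regardless of whether the TV has been turned on. Therefore, even when the TV is in standby, the offline far-field voice control system operates normally and can respond to the user's voice control.
When a traditional TV with no operating system is not connected to a network, the presence of the voice data in the preset voice library is determined by database comparison. The compilation and compression module compiles and compresses the voice data to obtain operation instructions and transmits them to the control module of the TV through the serial port (RX/TX). The control module automatically parses and compares the operation instructions so that the TV takes the corresponding action, such as power on, channel up or mute, thereby achieving far-field voice control by the user. The preset voice library can be set up as shown in the following table: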
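TV power on
TV power off
Channel up
Channel down
Volume up
Volume down
Mute
Unmute
Open menu
Close menu
Switch to AV
Switch to ATV
Switch to DTV
Switch to HDMI1
Switch to HDMI2
Switch to HDMI3
Switch to USB
Pause playback
Fast forward
Rewind
Start playback
Stop playback
Exit
Move up
Move down
Move left
Move right
Switch item left
Switch item right
Back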
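In this embodiment, the voice collection module collects the user's voice to obtain corresponding voice data; the voice recognition module determines that the voice data exists in the preset voice library; the compilation and compression module compiles and compresses the voice data to obtain operation instructions; and the control chip controls the TV to implement the corresponding operation according to the operation instructions. Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.
Referring to FIG. 2, FIG. 2 is a schematic structural diagram of an embodiment of the offline far-field voice system.
In this embodiment, the voice collection module 100 further includes a microphone 110, a first collection unit 120, a second collection unit 130, and a filtering unit 140. The first collection unit 120 is connected to the microphone 110 and is configured to collect voice through the microphone 110 to obtain voice collection data, and to send the voice collection data to the filtering unit 140. The second collection unit 130 is connected to the power amplifier of the TV and is configured to capture the power amplifier voice data by loopback and send it to the filtering unit 140. The filtering unit 140 is connected to the voice recognition module 200 and is configured to filter the power amplifier voice data out of the voice collection data, so that the filtered voice collection data serves as the voice data corresponding to the user's voice.
It should be noted that the voice collection data includes both the user's voice and the power amplifier voice data. Because the ambient noise is complex while the TV is playing, a loopback capture function is provided in order to improve recognition accuracy for voice control instructions and to filter out the sound emitted by the TV's power amplifier. It feeds the TV's own audio signal back to the voice collection module 100, so that the sound emitted by the power amplifier can be identified and the user's voice can be recognized accurately.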
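As a rough illustration of what the filtering unit 140 does with the loopback signal, the sketch below subtracts the best-fitting scaled copy of the power amplifier signal from the microphone signal. It is a deliberately simplified single-gain least-squares model chosen for illustration, not the method of the application, which does not name a filtering algorithm. Practical echo cancellers typically use adaptive multi-tap filters to handle delay and room acoustics. The function name `remove_amplifier_audio` is hypothetical.

```python
from typing import List, Sequence


def remove_amplifier_audio(mic: Sequence[float], loopback: Sequence[float]) -> List[float]:
    """Subtract the least-squares scaled loopback signal from the microphone signal."""
    if len(mic) != len(loopback):
        raise ValueError("mic and loopback must have the same length")
    energy = sum(r * r for r in loopback)
    if energy == 0.0:
        # The TV is silent, so there is nothing to remove.
        return list(mic)
    # Gain that minimizes the energy of (mic - gain * loopback).
    gain = sum(m * r for m, r in zip(mic, loopback)) / energy
    return [m - gain * r for m, r in zip(mic, loopback)]


# Toy signals: the user's voice plus half-amplitude TV audio reaching the microphone.
loopback = [1.0, -1.0, 1.0, -1.0]
user = [1.0, 1.0, 1.0, 1.0]
mic = [u + 0.5 * r for u, r in zip(user, loopback)]
cleaned = remove_amplifier_audio(mic, loopback)
```

In this toy case the user signal is uncorrelated with the loopback signal, so the subtraction recovers it exactly. With real audio the recovery is only approximate.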
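It is easy to understand that, since the user's voice is collected through the microphone 110, multiple microphones can be used to collect user voice from different directions in order to further improve the user experience, the voice signal processing quality, the recognition rate in a real environment, and the accuracy of user voice recognition. In this embodiment, there are optionally at least two microphones 110.
Further, the offline far-field voice control system includes a trigger module 500, which is connected to the voice collection module 100 and is configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module 100 to collect the user's voice.
It should be understood that, to prevent misoperation, a preset activation instruction library is stored in the trigger module 500. Only when the sound data corresponding to the user's sound contains a preset activation password is the voice collection module 100 activated to collect the user's voice continuously for a preset time. If no user voice is collected within the preset time, the offline far-field voice control system enters the dormant state and can only be activated again after receiving the activation password.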
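The activation and dormancy behavior described above can be sketched as a small state machine. The wake phrase "hello tv", the 10-second window, the class name `TriggerModule`, and the choice to restart the window after each collected command are all assumptions made for illustration. The application only states that collection continues for a preset time and that the system goes dormant when no voice is collected within it.

```python
from typing import Iterable, Optional


class TriggerModule:
    """Forwards commands only while an activation window is open."""

    def __init__(self, wake_phrases: Iterable[str], window_seconds: float) -> None:
        self.wake_phrases = set(wake_phrases)
        self.window_seconds = window_seconds
        self.active_until: Optional[float] = None  # None means dormant

    def is_active(self, now: float) -> bool:
        return self.active_until is not None and now < self.active_until

    def on_sound(self, text: str, now: float) -> Optional[str]:
        """Return the command to forward, or None if the sound is ignored."""
        if text in self.wake_phrases:
            # The activation password opens (or restarts) the collection window.
            self.active_until = now + self.window_seconds
            return None
        if not self.is_active(now):
            # Dormant or timed out: only the activation password is accepted.
            self.active_until = None
            return None
        # Active: forward the command and restart the window (an assumption).
        self.active_until = now + self.window_seconds
        return text


trigger = TriggerModule({"hello tv"}, window_seconds=10.0)
ignored = trigger.on_sound("mute", now=0.0)       # dormant, so ignored
trigger.on_sound("hello tv", now=1.0)             # activates collection
forwarded = trigger.on_sound("mute", now=5.0)     # within the window
```

Passing the timestamp in as an argument keeps the sketch deterministic. A real device would read a hardware clock instead.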
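Further, the offline far-field voice control system includes a display 600 connected to the control chip 400 and configured to display, on the TV, an interface corresponding to the operation.
It can be understood that the display 600 is the display of the TV and is configured to display the interface corresponding to the operation. For example, when the collected user voice is "open menu", the menu interface is shown on the display.
Through the design of the voice collection module, this embodiment captures the TV's own power amplifier sound by loopback and filters it out, and collects user voice from multiple directions through multiple microphones, which realizes the collection of the real user voice and improves recognition of the user's voice.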
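Referring to FIG. 3, FIG. 3 is a flowchart of an embodiment of an offline far-field voice method based on the offline far-field voice system.
In this embodiment, the offline far-field voice method includes the following steps:
S10: Collect user voice to obtain corresponding voice data;
It is easy to understand that this embodiment uses multiple microphones to collect user voice from different directions, which further improves the user experience, the voice signal processing quality, the recognition rate in a real environment, and the accuracy of user voice recognition.
In implementation, before the user's voice is collected, sounds made by the user are detected in real time, and the user's voice is collected only when it is determined that the sound data corresponding to those sounds contains a preset activation instruction.
It should be understood that, to prevent misoperation, the trigger module stores a preset activation instruction library. Only when the sound data corresponding to the user's sound contains a preset activation password is the voice collection module activated to collect the user's voice continuously for a preset time. If no user voice is collected within the preset time, the offline far-field voice control system enters the dormant state and can only be activated again after receiving the activation password.
S20: Determine that the voice data exists in the preset voice library;
It should be noted that, in this embodiment, there are several ways to determine that the voice data exists in the preset voice library, such as word-vector similarity matching or string similarity matching between the voice data and the entries in the preset voice library. The preset voice library can be set up as shown in the following table: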
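TV power on
TV power off
Channel up
Channel down
Volume up
Volume down
Mute
Unmute
Open menu
Close menu
Switch to AV
Switch to ATV
Switch to DTV
Switch to HDMI1
Switch to HDMI2
Switch to HDMI3
Switch to USB
Pause playback
Fast forward
Rewind
Start playback
Stop playback
Exit
Move up
Move down
Move left
Move right
Switch item left
Switch item right
Back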
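S30: Compile and compress the voice data to obtain operation instructions;
It is easy to understand that, when the voice data exists in the preset voice library, the compilation and compression module compiles and compresses the voice data, obtains the operation instruction, and transmits it to the control module of the TV through the serial port (RX/TX).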
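The application does not define the byte format of the operation instruction sent over the serial port. Purely as an illustration, the sketch below packs a one-byte opcode into a three-byte frame with a start marker and an additive checksum. The marker value `0xA5`, the frame layout, and both function names are hypothetical.

```python
import struct

FRAME_HEADER = 0xA5  # hypothetical start-of-frame marker


def encode_instruction(opcode: int) -> bytes:
    """Pack an opcode (0..255) as header, opcode, checksum."""
    checksum = (FRAME_HEADER + opcode) & 0xFF
    return struct.pack("BBB", FRAME_HEADER, opcode, checksum)


def decode_instruction(frame: bytes) -> int:
    """Validate a frame on the TV side and return its opcode."""
    if len(frame) != 3:
        raise ValueError("frame must be exactly 3 bytes")
    header, opcode, checksum = struct.unpack("BBB", frame)
    if header != FRAME_HEADER or checksum != ((header + opcode) & 0xFF):
        raise ValueError("corrupt frame")
    return opcode


frame = encode_instruction(0x07)    # for example, a made-up opcode for "mute"
opcode = decode_instruction(frame)  # the control module recovers the opcode
```

A fixed-length frame with a checksum lets the receiver reject bytes corrupted on the RX/TX line. Writing the frame to a real port would need a serial library, which is left out here.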
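S40: Control the TV to implement the corresponding operation according to the operation instructions.
In implementation, the control module matches the operation instruction against the preset control instruction library and, when the match succeeds, implements the corresponding operation according to the matching control instruction in the preset control instruction library, for example power on, channel up or mute, thereby achieving far-field voice control by the user.
In this embodiment, the control instruction corresponding to the operation instruction can be obtained by string similarity matching. The operation instruction is matched for string similarity against the control instructions in the preset control instruction library, and the match is judged successful when the string similarity between the operation instruction and a control instruction falls within a first preset range.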
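One way to picture the string similarity matching is with the ratio from Python's standard `difflib.SequenceMatcher`. The application does not say which similarity measure or range it uses, so the measure, the 0.8 threshold standing in for the "first preset range", and the instruction names below are assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical preset control instruction library.
CONTROL_INSTRUCTIONS = ["power_on", "power_off", "mute", "unmute"]


def match_instruction(operation: str, threshold: float = 0.8) -> Optional[str]:
    """Return the most similar control instruction, or None if none is close enough."""
    best, best_score = None, 0.0
    for candidate in CONTROL_INSTRUCTIONS:
        score = SequenceMatcher(None, operation, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    # The match succeeds only when the similarity falls inside the preset range.
    return best if best_score >= threshold else None


exact = match_instruction("mute")     # identical string
noisy = match_instruction("mutee")    # one stray character still matches "mute"
unknown = match_instruction("zzzz")   # nothing similar, so no match
```

Picking the single best candidate above the threshold avoids ambiguity between near-identical entries such as "mute" and "unmute".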
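In this embodiment, user voice is collected to obtain corresponding voice data; the voice data is determined to exist in the preset voice library; the voice data is compiled and compressed to obtain operation instructions; and the TV is controlled to implement the corresponding operation according to the operation instructions. Even when a traditional TV has no operating system and is not connected to the Internet, the TV can be controlled by collecting and recognizing the user's voice. The user only needs to issue a voice command to interact with the TV and no longer relies on the remote control, which meets the individual needs of TV users and makes the TV more intelligent.
In addition, this application also proposes an offline far-field voice control device, which includes the offline far-field voice control system described above, or applies the offline far-field voice control method described above. It is easy to understand that the offline far-field voice control device has at least the effects brought about by the above embodiments.
The above are only optional embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.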

Claims (20)

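  1. An offline far-field voice control system, comprising a voice collection module, a voice recognition module, a compilation and compression module, and a control chip; wherein,
    the voice collection module is configured to collect user voice to obtain corresponding voice data;
    the voice recognition module is configured to determine that the voice data exists in a preset voice library;
    the compilation and compression module is configured to compile and compress the voice data to obtain operation instructions;
    the control chip is configured to control the TV to implement the corresponding operation according to the operation instructions.
  2. The offline far-field voice control system according to claim 1, wherein the voice collection module further comprises a microphone, a first collection unit, a second collection unit, and a filtering unit;
    the first collection unit is connected to the microphone and is configured to collect voice through the microphone to obtain voice collection data, and to send the voice collection data to the filtering unit;
    the second collection unit is connected to the power amplifier of the TV and is configured to capture the power amplifier voice data by loopback and send the power amplifier voice data to the filtering unit;
    the filtering unit is connected to the voice recognition module and is configured to filter the power amplifier voice data out of the voice collection data, so that the filtered voice collection data serves as the voice data corresponding to the user's voice.
  3. The offline far-field voice control system according to claim 2, further comprising a trigger module connected to the voice collection module and configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module to collect the user's voice.
  4. The offline far-field voice control system according to claim 2, wherein there are at least two microphones.
  5. The offline far-field voice control system according to claim 1, further comprising a display connected to the control chip and configured to display an interface corresponding to the operation.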
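  6. An offline far-field voice control method, comprising the following steps:
    collecting user voice to obtain corresponding voice data;
    determining that the voice data exists in a preset voice library;
    compiling and compressing the voice data to obtain operation instructions; and
    controlling the TV to implement the corresponding operation according to the operation instructions.
  7. The offline far-field voice control method according to claim 6, wherein the step of controlling the TV to implement the corresponding operation according to the operation instructions comprises:
    matching the operation instruction against a preset control instruction library; and
    upon determining that the match succeeds, implementing the corresponding operation according to the control instruction in the preset control instruction library that matches the operation instruction.
  8. The offline far-field voice control method according to claim 7, wherein the step of matching the operation instruction against the preset control instruction library comprises:
    performing string similarity matching between the operation instruction and the control instructions in the preset control instruction library; and
    determining that the match succeeds when the string similarity between the operation instruction and a control instruction falls within a first preset range.
  9. The offline far-field voice control method according to claim 6, wherein the step of collecting user voice to obtain corresponding voice data comprises:
    detecting sounds made by the user in real time and, when it is determined that the sound data corresponding to those sounds contains a preset activation instruction, collecting the user's voice to obtain the corresponding voice data.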
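  10. An offline far-field voice control device, comprising an offline far-field voice control system, the offline far-field voice control system comprising a voice collection module, a voice recognition module, a compilation and compression module, and a control chip; wherein,
    the voice collection module is configured to collect user voice to obtain corresponding voice data;
    the voice recognition module is configured to determine that the voice data exists in a preset voice library;
    the compilation and compression module is configured to compile and compress the voice data to obtain operation instructions;
    the control chip is configured to control the TV to implement the corresponding operation according to the operation instructions.
  11. The offline far-field voice control device according to claim 10, wherein the voice collection module further comprises a microphone, a first collection unit, a second collection unit, and a filtering unit;
    the first collection unit is connected to the microphone and is configured to collect voice through the microphone to obtain voice collection data, and to send the voice collection data to the filtering unit;
    the second collection unit is connected to the power amplifier of the TV and is configured to capture the power amplifier voice data by loopback and send the power amplifier voice data to the filtering unit;
    the filtering unit is connected to the voice recognition module and is configured to filter the power amplifier voice data out of the voice collection data, so that the filtered voice collection data serves as the voice data corresponding to the user's voice.
  12. The offline far-field voice control device according to claim 11, wherein the offline far-field voice control system further comprises a trigger module connected to the voice collection module and configured to detect sounds made by the user in real time and, when it determines that the sound data corresponding to those sounds contains a preset activation instruction, to trigger the voice collection module to collect the user's voice.
  13. The offline far-field voice control device according to claim 11, wherein there are at least two microphones.
  14. The offline far-field voice control device according to claim 11, wherein the offline far-field voice control system further comprises a display connected to the control chip and configured to display an interface corresponding to the operation.
  15. The offline far-field voice control device according to claim 10, wherein the offline far-field voice control system further comprises a display connected to the control chip and configured to display an interface corresponding to the operation.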
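  16. An offline far-field voice control device, wherein the offline far-field voice control device applies an offline far-field voice control method, the offline far-field voice control method comprising the following steps:
    collecting user voice to obtain corresponding voice data;
    determining that the voice data exists in a preset voice library;
    compiling and compressing the voice data to obtain operation instructions; and
    controlling the TV to implement the corresponding operation according to the operation instructions.
  17. The offline far-field voice control device according to claim 16, wherein the step of controlling the TV to implement the corresponding operation according to the operation instructions comprises:
    matching the operation instruction against a preset control instruction library; and
    upon determining that the match succeeds, implementing the corresponding operation according to the control instruction in the preset control instruction library that matches the operation instruction.
  18. The offline far-field voice control device according to claim 17, wherein the step of matching the operation instruction against the preset control instruction library comprises:
    performing string similarity matching between the operation instruction and the control instructions in the preset control instruction library; and
    determining that the match succeeds when the string similarity between the operation instruction and a control instruction falls within a first preset range.
  19. The offline far-field voice control device according to claim 17, wherein the step of collecting user voice to obtain corresponding voice data comprises:
    detecting sounds made by the user in real time and, when it is determined that the sound data corresponding to those sounds contains a preset activation instruction, collecting the user's voice to obtain the corresponding voice data.
  20. The offline far-field voice control device according to claim 16, wherein the step of collecting user voice to obtain corresponding voice data comprises:
    detecting sounds made by the user in real time and, when it is determined that the sound data corresponding to those sounds contains a preset activation instruction, collecting the user's voice to obtain the corresponding voice data.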
PCT/CN2019/130505 2019-07-16 2019-12-31 Offline far-field voice control system, control method and device WO2021008095A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910644412.1 2019-07-16
CN201910644412.1A CN110379422B (zh) 2019-07-16 2019-07-16 Offline far-field voice control system, control method and device

Publications (1)

Publication Number Publication Date
WO2021008095A1 (zh) 2021-01-21

Family

ID=68253516

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130505 WO2021008095A1 (zh) 2019-07-16 2019-12-31 Offline far-field voice control system, control method and device

Country Status (2)

Country Link
CN (1) CN110379422B (zh)
WO (1) WO2021008095A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110379422B (zh) * 2019-07-16 2022-05-20 深圳创维-Rgb电子有限公司 Offline far-field voice control system, control method and device
CN111679594A (zh) * 2020-06-24 2020-09-18 四川长虹电器股份有限公司 Far-field voice switch control method and system for visual smart products

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211504A (zh) * 2006-12-31 2008-07-02 康佳集团股份有限公司 Method, system and device for remotely controlling a television by voice
US20120226502A1 (en) * 2011-03-01 2012-09-06 Kabushiki Kaisha Toshiba Television apparatus and a remote operation apparatus
CN105869636A (zh) * 2016-03-29 2016-08-17 上海斐讯数据通信技术有限公司 Speech recognition apparatus and method, and smart television and control method thereof
CN109788398A (zh) * 2017-11-10 2019-05-21 阿里巴巴集团控股有限公司 Sound pickup device for far-field voice
CN110379422A (zh) * 2019-07-16 2019-10-25 深圳创维-Rgb电子有限公司 Offline far-field voice control system, control method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101902587A (zh) * 2009-06-01 2010-12-01 沈阳同方多媒体科技有限公司 System for controlling a television by voice
US9443516B2 (en) * 2014-01-09 2016-09-13 Honeywell International Inc. Far-field speech recognition systems and methods
CN104216351B (zh) * 2014-02-10 2017-09-29 美的集团股份有限公司 Voice control method and system for household appliances
KR101989106B1 (ko) * 2017-03-31 2019-06-13 엘지전자 주식회사 Home appliance, voice recognition module and home appliance system
CN107566874A (zh) * 2017-09-22 2018-01-09 百度在线网络技术(北京)有限公司 Far-field voice control system based on television equipment

Also Published As

Publication number Publication date
CN110379422A (zh) 2019-10-25
CN110379422B (zh) 2022-05-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 19937959; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 19937959; Country of ref document: EP; Kind code of ref document: A1