CN101516005A - Speech recognition channel selecting system, method and channel switching device - Google Patents

Speech recognition channel selecting system, method and channel switching device

Info

Publication number
CN101516005A
Authority
CN
China
Prior art keywords
channel
speech
voice
recognition
user
Prior art date
Application number
CNA2008100654170A
Other languages
Chinese (zh)
Inventor
吴治国
张勤伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CNA2008100654170A
Publication of CN101516005A

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C23/00 Non-electrical signal transmission systems, e.g. optical systems
    • G08C23/02 Non-electrical signal transmission systems, e.g. optical systems using infrasonic, sonic or ultrasonic waves
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42222 Additional components integrated in the remote control device, e.g. timer, speaker, sensors for detecting position, direction or movement of the remote control, microphone or battery charging device
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network, synchronizing decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry
    • H04N5/4403 User interfaces for controlling a television receiver or set top box [STB] through a remote control device, e.g. graphical user interfaces [GUI]; Remote control devices therefor
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C2201/00 Transmission systems of control signals via wireless link
    • G08C2201/30 User interface
    • G08C2201/31 Voice input
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Taking into account non-speech characteristics
    • G10L2015/228 Taking into account non-speech characteristics of application context
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry
    • H04N5/4403 User interfaces for controlling a television receiver or set top box [STB] through a remote control device, e.g. graphical user interfaces [GUI]; Remote control devices therefor
    • H04N2005/4405 Hardware details of remote control devices
    • H04N2005/4432 Remote control devices equipped or combined with PC-like input means, e.g. voice recognition or pointing device

Abstract

The invention provides a speech recognition channel selection system, a method, and a channel switching device. In the method, a controller receives a speech input signal from a user; the channel switching device recognizes a name to be matched according to the input speech signal and a recognition vocabulary; the name to be matched is matched against a matching table to obtain the channel to switch to; and the device switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, is convenient for the user, and makes full use of the channel switching device's capabilities to reduce control cost. Because the channel switching device itself recognizes the name to be matched, no dedicated speech recognition server is needed in the network, which prevents overlong response times, avoids data loss in network transmission, and saves network construction cost.

Description

A speech recognition channel selection system, method, and channel switching device

Technical Field

The present invention relates to the field of communications technologies, and in particular to a system, device, and method for channel selection by speech recognition.

Background

In recent years, with the development of information technology and broadcast television technology, services such as digital cable TV and IPTV have grown rapidly. As set-top boxes (STBs), such as IP set-top boxes and digital set-top boxes, are gradually commercialized, their comprehensive functions are gradually replacing traditional VCD and DVD players. Meanwhile, advances in automatic speech recognition have made it possible for a set-top box to select channels by voice, and this technology has become a focus of research and development in the industry.

Conventional speech recognition channel selection takes two forms. One adds a speech recognition processor to the remote control; during recognition, the channel is switched according to the speech data determined by matching speech templates, downloaded through user input, against the speech data input by the user. The other sets up a dedicated speech recognition server in the network.

In implementing the present invention, the inventors found that the conventional approaches have at least the following drawbacks. With a speech recognition processor added to the remote control, every update of the speech templates requires the user to download them to the remote control manually, which is complicated and inconvenient, and the processor also raises the cost of the remote control. With a dedicated speech recognition server in the network, the speech signal must be uploaded to the network for recognition, so the response time is long; because data crosses the network twice (uplink and downlink), the probability of packet loss also increases, and the dedicated server adds to the cost of building the network.

Summary of the Invention

In view of this, it is necessary to provide a speech recognition channel selection method that is convenient to operate and saves cost.

A speech recognition channel switching system that is convenient to operate and saves cost is also provided, as is a channel switching device that is convenient to operate and saves cost.

A speech recognition channel selection method comprises the following steps: a controller receives a speech input signal from a user;

a channel switching device recognizes a name to be matched according to the input speech signal and a recognition vocabulary; the name to be matched is matched against a matching table to obtain the channel to switch to; and the device switches to that channel.

A speech recognition channel selection system comprises a controller configured to communicate with a channel switching processing device;

the controller is configured to receive a speech input signal from a user;

the channel switching processing device is configured to recognize a name to be matched according to the speech input signal and a recognition vocabulary, match the name to be matched against a matching table to obtain the channel to switch to, and switch to that channel.

A channel switching device comprises:

a receiving module, configured to receive a user's speech input signal sent by a controller; a recognition processing module, configured to recognize a name to be matched according to the speech input signal and a recognition vocabulary;

a query matching module, configured to match the name to be matched against a matching table to obtain the channel to switch to;

a channel switching control module, configured to switch to that channel.

Compared with the prior art, in embodiments of the present invention the controller receives the user's speech input signal, and the channel switching device recognizes the name to be matched from that signal, matches it against the matching table to obtain the channel to switch to, and switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, makes operation convenient for the user, and makes full use of the channel switching device's capabilities, reducing the cost of control. Because the channel switching device recognizes the name to be matched, no dedicated speech recognition server is needed in the network, which prevents overlong response times, avoids data loss in network transmission, and saves network construction cost.

Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of a speech recognition channel switching system according to an embodiment of the present invention.

FIG. 2 is a schematic structural diagram of a controller according to an embodiment of the present invention.

FIG. 3 is a schematic structural diagram of a channel switching processing device according to an embodiment of the present invention.

FIG. 4 is a flowchart of a speech recognition channel selection method according to an embodiment of the present invention.

FIG. 5 is a flowchart of a method for updating the channel and program table according to an embodiment of the present invention.

FIG. 6 is a flowchart of a method for updating the recognition vocabulary and the matching table according to an embodiment of the present invention.

Detailed Description

Referring to FIG. 1, a speech recognition channel switching system 100 according to an embodiment of the present invention includes a controller 102, a channel switching device 104, and an electronic program guide (EPG) server 106. The controller 102 is configured to receive a speech input signal from a user. The channel switching device 104 is configured to recognize a name to be matched according to the speech input signal and a recognition vocabulary, match the name to be matched against a matching table to obtain the channel to switch to, and switch to that channel. The EPG server 106 is configured to provide the latest matching table and/or the latest recognition vocabulary; the channel switching device 104 may update its matching table according to the latest matching table and/or update its recognition vocabulary according to the latest recognition vocabulary. The controller 102 may be an external system controller, a handset (HS), or a remote control; this embodiment uses a remote control as an example. The channel switching device 104 may be a personal computer (PC), a set-top box (STB), a notebook computer (NB), a handset (HS), a game player (GP), an optical disc drive (ODD), or the like; this embodiment is described using an STB as an example.

Referring also to FIG. 2, in this embodiment the controller 102 includes a voice receiving module 202, a voice signal processing module 204, an input module 210, a controller receiving module 212, and a sending module 216. The voice receiving module 202 is configured to receive the user's speech input signal; in this embodiment it may be a microphone on the remote control.

The voice signal processing module 204 is configured to process the user's speech input signal. It includes a voice conversion unit 206 and a voice encoding unit 208. The voice conversion unit 206 converts the speech signal into a digital signal; in this embodiment it may be an A/D conversion circuit. The voice encoding unit 208 encodes the digital signal produced by the voice conversion unit 206; the encoding may be compression coding, either lossy or lossless. Different schemes may be used to capture and process the user's speech. In this embodiment, the speech is sampled at 16 kHz and quantized with 16-bit or 8-bit precision. After sampling and processing, the speech signal is encoded in PCM (Pulse Code Modulation) format.
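The sampling and quantization figures above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `quantize_pcm` and `pcm_bitrate`, the signed-integer code range, and the 440 Hz test tone are assumptions made for illustration. Only the 16 kHz rate and the 8/16-bit precision come from the text.

```python
import math

SAMPLE_RATE_HZ = 16_000  # sampling rate stated in the embodiment


def quantize_pcm(samples, bits=16):
    """Map floats in [-1.0, 1.0] to signed integer PCM codes (assumed layout)."""
    if bits not in (8, 16):
        raise ValueError("the embodiment mentions 8- or 16-bit precision")
    max_code = 2 ** (bits - 1) - 1  # 32767 for 16 bit, 127 for 8 bit
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range input
        codes.append(int(round(x * max_code)))
    return codes


def pcm_bitrate(bits=16, sample_rate=SAMPLE_RATE_HZ):
    """Raw (uncompressed) mono PCM bit rate in bits per second."""
    return bits * sample_rate


# 10 ms of a 440 Hz tone (hypothetical test signal, 160 samples at 16 kHz)
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(160)]
pcm16 = quantize_pcm(tone, bits=16)
```

Under these assumptions, uncompressed 16-bit mono speech needs 256 kbit/s and 8-bit speech needs 128 kbit/s, which gives a rough sense of the throughput the wireless link between controller and set-top box has to sustain.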

The input module 210 is configured to receive instructions input by the user, such as a voice activation instruction that controls the channel switching device to activate voice input. In this embodiment, the input module 210 may be a keyboard or a touch screen.

The controller receiving module 212 is configured to receive signals sent by the channel switching device 104, including returned instruction signals and notification messages.

The sending module 216 is configured to send the encoded speech signal input by the user and operation signals. In this embodiment, the sending module 216 may be a wireless communication device using infrared, Bluetooth, or the like, for example Bluetooth 2.0, ZigBee, a high-speed infrared protocol, or another high-speed wireless technology that can carry PCM (Pulse Code Modulation) speech data in real time. The sending module 216 includes an operation signal sending unit 218, configured to send operation signals input by the user, such as keyboard and touch-screen input, and a voice signal sending unit 214, configured to send the speech signal input by the user; this signal is the A/D-converted digital signal and may also be compression-coded.

Referring also to FIG. 3, in this embodiment the channel switching device 104 (STB) includes a receiving module 302, a mute control module 308, a language selection module 310, a recognition processing module 312, a sending module 322, a rejection prompt module 324, a storage module 326, a query matching module 336, a channel switching control module 338, and an update module 340.

The receiving module 302 is configured to receive the user's speech input signal and the user's operation control instructions sent by the controller. In this embodiment, the user input signal includes both; if all input is by voice, the user control instruction signal may be absent. The user's speech input signal is a digital speech signal obtained by analog-to-digital (A/D) conversion. The receiving module 302 includes an operation signal receiving unit 304 and a voice signal receiving unit 306. The operation signal receiving unit 304 receives the user's operation control instructions, such as the voice activation instruction. The voice signal receiving unit 306 receives the user's speech input signal.

The mute control module 308 is configured to put the channel switching device into a mute state according to the voice activation instruction input by the user, and to switch it from the mute state back to a non-mute state after speech capture.

The language selection module 310 is configured to select, according to a language selection signal input by the user, the acoustic model corresponding to that signal.

The recognition processing module 312 is configured to recognize the name to be matched according to the input speech signal and the recognition vocabulary. It includes a voice activity detection unit 314, a speech feature extraction unit 316, a speech recognition unit 318, and a speech judgment unit 320.

The voice activity detection unit 314 is configured to detect the start point and end point of the actual speech segment. In this embodiment, it uses a robust endpoint detection algorithm to find where actual speech starts and ends, separating the actual speech segment from the non-speech segments in the input speech signal.
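The patent does not say which endpoint detection algorithm is used. As a hedged illustration only, the sketch below uses a simple short-time energy detector, a common baseline, to find the first and last sustained runs of speech frames. The function names, the 10 ms frame length, the energy threshold, and the minimum run length are all assumptions.

```python
def frame_energies(samples, frame_len=160):
    """Mean squared amplitude per non-overlapping frame (160 samples = 10 ms at 16 kHz)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


def detect_endpoints(samples, frame_len=160, threshold=1e-3, min_frames=3):
    """Return (start_sample, end_sample) of the speech segment, or None.

    A frame counts as speech when its energy exceeds `threshold`. Runs shorter
    than `min_frames` frames are ignored as clicks; the segment spans from the
    first qualifying run to the end of the last one.
    """
    active = [e > threshold for e in frame_energies(samples, frame_len)]
    runs = []
    i = 0
    while i < len(active):
        if active[i]:
            j = i
            while j < len(active) and active[j]:
                j += 1
            if j - i >= min_frames:
                runs.append((i, j))
            i = j
        else:
            i += 1
    if not runs:
        return None
    return runs[0][0] * frame_len, runs[-1][1] * frame_len
```

A production detector would typically adapt the threshold to the noise floor and add hangover frames; this sketch only shows how a start and end point separate the actual speech segment from surrounding non-speech.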

The speech feature extraction unit 316 is configured to extract speech features from the speech signal. In this embodiment, it processes the speech signal passed on by the voice activity detection unit 314 and extracts speech feature data. The feature type may be MFCC (Mel-Frequency Cepstral Coefficients), PLP (Perceptual Linear Prediction), or LPCC (Linear Prediction Cepstral Coefficients). To improve noise robustness, cepstral mean subtraction may be applied during feature extraction. Because MFCC features exploit the perceptual characteristics of the human ear and are comparatively robust to noise, MFCC is the preferred feature. Speech is a short-time stationary signal and adjacent speech frames are correlated, so first-order differences, or first- and second-order differences, of the MFCC features may be extracted to improve recognition accuracy.
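Cepstral mean subtraction and the first- and second-order differences mentioned above can be sketched as follows. The sketch assumes the static cepstral vectors (for example MFCCs) have already been computed per frame. The regression-style delta formula with a two-frame window is a widely used convention and not something the patent specifies, and all function names are hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over the utterance from every frame."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def delta(frames, window=2):
    """Regression deltas over +/- `window` frames; edge frames are repeated as padding."""
    n = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, window + 1):
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames):
    """Concatenate static, first-order, and second-order coefficients per frame."""
    static = cepstral_mean_subtraction(frames)
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]
```

With 13 static coefficients per frame, `add_dynamic_features` would yield 39-dimensional vectors, a layout often paired with HMM recognizers.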

The speech recognition unit 318 is configured to compute, according to the acoustic model and the recognition vocabulary, the acoustic distance of the input speech feature data to each vocabulary entry. In this embodiment, it uses the acoustic model data and the isolated-word vocabulary data to obtain the shortest cumulative acoustic distance for each isolated word, and takes the isolated word with the smallest such distance as the first-choice recognition result for the utterance. The acoustic models used for recognition include continuous HMMs (Hidden Markov Models) and discrete HMMs. The speech recognition unit 318 may also present several candidate results for the user to choose from, ranked by shortest cumulative acoustic distance.

The speech judgment unit 320 is configured to determine whether the acoustic distance of the speech feature data to a vocabulary entry is below a threshold; if it is, the channel name corresponding to the current utterance is computed from the recognition vocabulary and the matching table.
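The selection and rejection logic of units 318 and 320 can be sketched as below. The sketch assumes the per-entry cumulative acoustic distances have already been computed by the HMM decoder and are passed in as a dictionary. The function names, the example channel names, and the distance values are hypothetical.

```python
def rank_candidates(distances, n_best=3):
    """Sort vocabulary entries by cumulative acoustic distance, smallest first."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    return ranked[:n_best]


def recognize(distances, reject_threshold):
    """Return the best entry, or None when its distance is not below the threshold."""
    if not distances:
        return None
    word, dist = rank_candidates(distances, n_best=1)[0]
    return word if dist < reject_threshold else None


# Hypothetical distances for one utterance against three vocabulary entries
example = {"CCTV-1": 12.5, "CCTV-5": 30.0, "Phoenix": 18.2}
best = recognize(example, reject_threshold=20.0)
```

Returning `None` corresponds to the case in which the utterance is rejected and the user is asked to speak again, while `rank_candidates` models the optional list of candidates offered to the user.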

The sending module 322 is configured to send recognition processing signals to the controller 102; once recognition processing is complete, the controller 102 may stop capturing the user's speech input signal. In this embodiment, the sending module 322 may also transmit signals wirelessly, for example by Bluetooth or infrared.

The rejection prompt module 324 is configured to prompt the user to speak again when the recognition result is non-speech. The prompt may be a message, an on-screen display, or a sound; this embodiment prompts the user by displaying text on the screen.

The storage module 326 is configured to store data such as the channel and program table, the recognition vocabulary, the acoustic models, and the matching table. In this embodiment, the storage module 326 includes a channel and program table storage unit 328, a recognition vocabulary storage unit 330, an acoustic model storage unit 332, and a matching table storage unit 334.

The channel and program table storage unit 328 stores the channel-to-program correspondence table. In this embodiment, each record in the table holds the name of a live TV channel and the name of the program that channel is currently broadcasting. The table may be updated from the EPG server 106; the update period may be set to one day or one week, and the specific interval may follow the EPG server update interval of the IPTV or digital cable TV system.

The recognition vocabulary storage unit 330 stores the recognition vocabulary. In this embodiment, the recognition vocabulary also includes an isolated-word vocabulary used for isolated-word speech recognition.

The acoustic model storage unit 332 stores the acoustic models to be matched. This embodiment uses the model parameters of an acoustic model built by bilingual mixed modeling with HMMs. The parameters of the bilingual mixed acoustic model are speaker-independent. The model parameters must be trained in advance by a trainer on annotated corpus data; the trained parameters can then be fixed in the acoustic model parameter storage for isolated-word speech recognition. The acoustic model parameters include the state parameters of the hidden Markov models and the probability distribution functions of the observation feature vectors output by the states.

The matching table storage unit 334 stores the matching table, which records the correspondence between the channel the user wants to switch to and the channel named in the user's speech input.

The query matching module 336 is configured to match the name to be matched against the matching table to obtain the channel to switch to. In this embodiment, the recognized isolated word is used as the query keyword, and the channel name column of the channel and program table is searched first for records that match the keyword.

The channel switching control module 338 is configured to switch to the channel obtained. If matching records exist and the query returns a single record, the set-top box is controlled to switch live TV to the channel identified by that record's channel name attribute. If the query returns multiple records, the TV screen displays the channel name attribute values of those records and prompts the user to choose one channel with the remote control; once the user has chosen, the TV is switched to the selected channel.
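The query and switching decision described above can be sketched as a small lookup. The record layout (`channel` and `program` keys), the substring match on the channel name, and the action labels are assumptions chosen for illustration. The patent only states that the keyword is looked up in the channel name column and that single and multiple hits are handled differently.

```python
def query_channels(keyword, channel_table):
    """Return records whose channel name contains the recognized keyword."""
    return [rec for rec in channel_table if keyword in rec["channel"]]


def decide_action(keyword, channel_table):
    """Map the query result to an action: switch, prompt the user, or report no match."""
    matches = query_channels(keyword, channel_table)
    if not matches:
        return ("no_match", [])
    if len(matches) == 1:
        return ("switch", [matches[0]["channel"]])
    return ("prompt", [rec["channel"] for rec in matches])


# Hypothetical channel and program table
table = [
    {"channel": "CCTV-1", "program": "News"},
    {"channel": "CCTV-5", "program": "Football"},
    {"channel": "Phoenix TV", "program": "Talk"},
]
```

In this sketch, `decide_action("Phoenix", table)` yields a direct switch, while `decide_action("CCTV", table)` yields a prompt listing both CCTV channels for the user to choose from with the remote control.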

The update module 340 is configured to update the matching table and/or the recognition vocabulary from the EPG server. It includes an update timing unit 342 and an update control unit 344. The update timing unit 342 keeps track of the update time and triggers an update when the update time arrives or is exceeded; in this embodiment, the channel and program table may be set to update daily, and the recognition vocabulary and matching table may be set to update every minute. The update control unit 344 controls updating of the matching table and/or the recognition vocabulary when the update time is reached.
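The interplay of the update timing unit and the update control unit can be sketched as below. The class and function names are hypothetical, time is passed in explicitly as seconds so the logic stays deterministic, and `fetch` merely stands in for whatever request the device would make to the EPG server.

```python
class UpdateTimer:
    """Tracks when a table was last refreshed and whether a refresh is due."""

    def __init__(self, interval_s, now=0.0):
        self.interval_s = interval_s
        self.last_update = now

    def due(self, now):
        return now - self.last_update >= self.interval_s

    def mark_updated(self, now):
        self.last_update = now


def run_updates(timers, now, fetch):
    """Refresh every table whose timer is due; `fetch(name)` is a placeholder for an EPG request."""
    refreshed = {}
    for name, timer in timers.items():
        if timer.due(now):
            refreshed[name] = fetch(name)
            timer.mark_updated(now)
    return refreshed


# Intervals taken from the embodiment: daily for the channel and program table,
# every minute for the recognition vocabulary and the matching table.
timers = {
    "channel_program_table": UpdateTimer(24 * 60 * 60),
    "recognition_vocabulary": UpdateTimer(60),
    "matching_table": UpdateTimer(60),
}
```

Calling `run_updates(timers, now, fetch)` periodically refreshes only the tables whose interval has elapsed, which mirrors the "arrives or is exceeded" trigger condition.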

In embodiments of the present invention, the controller receives the user's speech input signal, and the channel switching device recognizes the name to be matched from that signal, matches it against the matching table to obtain the channel to switch to, and switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, makes operation convenient for the user, and makes full use of the channel switching device's capabilities, reducing the cost of control. Because the channel switching device recognizes the name to be matched, no dedicated speech recognition server is needed in the network, which prevents overlong response times, avoids data loss in network transmission, and saves network construction cost. By extracting only the actual speech segment, the embodiments improve recognition accuracy. The mute control module mutes the set-top box during speech input, preventing the sound of the TV broadcast from interfering with the user's speech. Automatically updating the channel and program table, the recognition vocabulary, and the matching table from the EPG server through the update module spares the user the inconvenience of manual operation.

Referring also to FIG. 4, a speech recognition channel selection method according to an embodiment of the present invention includes the following steps.

Step 402: the controller receives a voice activation instruction input by the user. In this embodiment, the voice activation instruction may be a key signal, that is, an instruction signal the user enters through an input device such as a keyboard or touch screen.

步骤404,控制器向频道转换装置发送启动语音识别控制指令信号。 Step 404, the controller device transmits a voice recognition start control command signal to the channel conversion. 本实施例中,以蓝牙、高速红外协议、紫蜂Zigbee等无线发送方式为例,通过遥控器向机顶盒发送启动语音识别控制指令信号。 In this embodiment, Bluetooth, infrared high-speed protocol, Zigbee Zigbee wireless transmission scheme as an example, the transmission start instruction signal to voice recognition control set-top box via the remote control.

Step 406: The channel switching device is set to the mute state.

Step 408: The channel switching device sends a start-voice-capture control instruction signal to the controller. If the mute function is not used, the above steps may be omitted; the details are not repeated here.

Step 410: The controller receives the user's voice input signal, and captures and processes the speech signal input by the user. In this embodiment, an A/D converter converts the analog speech signal into a digital speech signal, which is transmitted wirelessly to the channel switching device.

Step 412: The channel switching device detects the start and end points of the actual speech segment; these points are then used to recognize the name to be matched. In this embodiment, voice activity detection uses a robust endpoint detection algorithm to find the start and end of the actual speech, so that the actual speech segments in the input signal can be distinguished from the non-speech segments.
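The endpoint detection in step 412 can be illustrated with a minimal sketch. The patent does not specify which robust algorithm is used, so the code below substitutes a plain short-time-energy threshold; the function name `detect_endpoints`, the frame length, and the threshold values are all assumptions chosen for illustration.

```python
def detect_endpoints(samples, frame_len=160, energy_threshold=0.01, min_speech_frames=3):
    """Return (start, end) sample indices of the speech segment, or None if none is found.

    Simple energy-based sketch, not the patent's (unspecified) robust algorithm.
    """
    # Short-time energy of each non-overlapping frame.
    energies = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    active = [e > energy_threshold for e in energies]

    # The first run of at least min_speech_frames active frames marks the start,
    # so that an isolated click is not mistaken for speech.
    start_frame = None
    run = 0
    for idx, is_active in enumerate(active):
        run = run + 1 if is_active else 0
        if run >= min_speech_frames:
            start_frame = idx - min_speech_frames + 1
            break
    if start_frame is None:
        return None

    # The last active frame marks the end.
    end_frame = max(i for i, a in enumerate(active) if a)
    return start_frame * frame_len, (end_frame + 1) * frame_len
```

Only the samples between the returned indices would then be passed on to feature extraction; a production detector would typically add hangover frames and an adaptive noise floor.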

Step 414: The channel switching device sends a stop-voice-capture control signal to the controller. Once recognition processing is complete, the controller can stop capturing the user's voice input signal. In this embodiment, this signal may likewise be sent wirelessly, for example over Bluetooth, a high-speed infrared protocol, or Zigbee.

Step 416: Under control of the stop-voice-capture control signal from the channel switching device, the controller stops capturing and processing the speech signal.

Step 418: The signal of the actual speech segment between the start point and the end point is passed to the speech feature extraction unit. Steps 418 and 414 need not be performed in any particular order, and step 418 may also be performed before step 416; the details are not repeated here.
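The control handshake in steps 402 to 416 can be sketched as two cooperating objects. This is only an illustrative model under stated assumptions: the class and method names are hypothetical, direct method calls stand in for the wireless link, and endpoint detection is reduced to a log entry.

```python
class ChannelSwitchingDevice:
    """Stands in for the set-top box side of the handshake (hypothetical names)."""

    def __init__(self):
        self.muted = False
        self.log = []

    def on_start_recognition(self, controller):
        # Steps 406 and 408: mute, then ask the controller to start capturing.
        self.muted = True
        self.log.append("mute")
        controller.on_start_capture()

    def on_audio(self, controller, samples):
        # Steps 412 and 414: once the speech segment has been received,
        # tell the controller to stop capturing.
        self.log.append("audio:%d" % len(samples))
        controller.on_stop_capture()


class Controller:
    """Stands in for the remote control side of the handshake (hypothetical names)."""

    def __init__(self):
        self.capturing = False

    def press_voice_key(self, device):
        # Steps 402 and 404: the key press triggers the start-recognition signal.
        device.on_start_recognition(self)

    def on_start_capture(self):
        self.capturing = True

    def send_audio(self, device, samples):
        # Step 410: digitized speech is forwarded only while capture is enabled.
        if not self.capturing:
            raise RuntimeError("capture has not been started")
        device.on_audio(self, samples)

    def on_stop_capture(self):
        # Step 416: stop capturing and processing speech.
        self.capturing = False
```

The point of the sketch is the ordering: the device mutes itself before capture begins, and the controller stops capturing only when the device tells it to.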

Step 420: The speech feature extraction unit extracts speech features from the input speech signal. In this embodiment, if an earlier step detected the actual speech segment, features need only be extracted from that segment. The feature type may be MFCC, PLP, or LPCC features, and cepstral mean subtraction may be applied during feature extraction to improve noise robustness. Because MFCC features exploit the acoustic perception characteristics of the human ear and are fairly robust to noise, MFCC features are preferred as the speech features. Speech is a short-time stationary signal with correlation between frames, so first-order differences, or first- and second-order differences, of the MFCC features may also be extracted to improve recognition accuracy.
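The two post-processing operations named in step 420, cepstral mean subtraction and difference (delta) features, can be sketched as follows. The sketch assumes the cepstral coefficients have already been computed and are given as a list of per-frame lists; the function names and the simple two-frame difference are illustrative choices, not taken from the patent.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean, computed over all frames, from every frame."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def delta(frames):
    """First-order difference between the neighbouring frames (edges are clamped)."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def add_deltas(frames):
    """Append first- and second-order differences to each static feature vector."""
    d1 = delta(frames)
    d2 = delta(d1)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]
```

Applied to 13 static coefficients per frame, `add_deltas` would give 39-dimensional vectors, a common layout in HMM-based recognizers.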

Step 422: The acoustic distance of the input speech feature data to each entry is computed from the acoustic model and the recognition vocabulary. In this embodiment, speech recognition uses the acoustic model data and the isolated-word list data to obtain the shortest cumulative acoustic distance for each isolated word, and then takes the isolated word with the smallest such distance as the first-choice recognition result. The acoustic models used for recognition include continuous HMMs and discrete HMMs. Recognition may also present several candidate results for the user to choose from, ranked by shortest cumulative acoustic distance. This embodiment uses the model parameters of an acoustic model built by HMM-based bilingual mixed modeling. The parameters of the bilingual mixed acoustic model are speaker-independent; that is, the model is intended for non-specific speakers. The model parameters must be trained in advance by a trainer on annotated corpus data. The trained parameters can then be fixed in the acoustic model parameter storage unit and used for isolated-word recognition. The acoustic model parameters include the HMM state parameters and the probability distribution functions of the observation feature vectors output by each state. Before this step, the method may also include a step of selecting, according to a language selection signal input by the user, the acoustic model corresponding to that signal.
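One way to read the "shortest cumulative acoustic distance" in step 422 is as the smallest negative log-probability of any state path through a word's HMM, found by the Viterbi algorithm. The sketch below assumes discrete HMMs whose paths start in the first state and end in the last; the dictionary layout (`trans`, `emit`) and the function names are hypothetical and not defined by the patent.

```python
import math


def viterbi_distance(observations, hmm):
    """Smallest cumulative -log probability over all state paths of one word model."""
    trans, emit = hmm["trans"], hmm["emit"]
    n_states = len(emit)
    inf = float("inf")

    def cost(p):
        # Zero probability means an impossible step, i.e. infinite distance.
        return -math.log(p) if p > 0 else inf

    # Paths are assumed to start in state 0.
    dist = [inf] * n_states
    dist[0] = cost(emit[0].get(observations[0], 0.0))
    for obs in observations[1:]:
        new = [inf] * n_states
        for j in range(n_states):
            best = min(dist[i] + cost(trans[i][j]) for i in range(n_states))
            new[j] = best + cost(emit[j].get(obs, 0.0))
        dist = new
    # Paths are assumed to end in the final state.
    return dist[-1]


def recognize(observations, models):
    """Return (word, distance) pairs sorted by ascending cumulative distance."""
    scored = sorted((viterbi_distance(observations, m), w) for w, m in models.items())
    return [(w, d) for d, w in scored]
```

The first element of the list returned by `recognize` corresponds to the first-choice result, and the remaining elements give the ranked candidate list mentioned in the step.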

Step 424: Determine whether the acoustic distance of the speech feature data to each entry is less than a threshold. If the acoustic distance is not less than the threshold, go to step 426; if it is less than the threshold, go to step 428.

Step 426: If the acoustic distance of the speech feature data to the entry is greater than or equal to the threshold, the recognition result is non-speech, and the user is prompted to re-enter the speech. The prompt may be a message, an on-screen video prompt, or a sound prompt; in this embodiment, the user is prompted by text displayed on the screen. After step 426, this recognition process ends.

Step 428: If the acoustic distance of the speech feature data to the entry is less than the threshold, the channel name corresponding to the current speech is computed from the recognition vocabulary and the matching table. In this embodiment, the shortest cumulative acoustic distance for each isolated word is obtained from the acoustic model data and the isolated-word list data, and the isolated word with the smallest such distance is taken as the first-choice recognition result. The acoustic models used for recognition include continuous HMMs and discrete HMMs. Several candidate results may also be presented for the user to choose from, ranked by shortest cumulative acoustic distance.
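The accept/reject decision in steps 424 to 428 can be sketched as a threshold test over a ranked candidate list. The function name `decide`, the result dictionary, the prompt text, and the candidate limit are assumptions made for illustration.

```python
def decide(ranked, threshold, max_candidates=3):
    """Apply the distance threshold to (word, distance) pairs sorted by ascending distance."""
    accepted = [(w, d) for w, d in ranked if d < threshold]
    if not accepted:
        # Step 426: every distance is at or above the threshold, so the input is
        # treated as non-speech and the user is asked to try again.
        return {"status": "rejected",
                "prompt": "Speech not recognized, please try again."}
    # Step 428: keep the closest entries; the first one is the first-choice result.
    return {"status": "accepted",
            "candidates": [w for w, _ in accepted[:max_candidates]]}
```

For example, `decide([("Channel A", 12.0), ("Channel B", 30.0)], 20.0)` accepts only "Channel A", while a threshold of 10.0 rejects the utterance.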

Step 430: Switch to the required channel according to the recognized channel name. If a matching record exists and the query returns a single record, the set-top box is controlled to switch live TV to the channel identified by the channel-name attribute of that record. If the query returns multiple records, the TV screen displays the channel-name attribute values of those records and prompts the user to choose one of the channels with the remote control; once the user has chosen, the TV is switched to the selected channel.
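The single-record and multiple-record cases in step 430 can be sketched as a lookup followed by an optional user choice. The record layout (a `key` holding the recognized name and a `channel` holding the channel-name attribute) and the `choose` callback are assumptions; the patent does not define the matching table's schema.

```python
def lookup_channels(name, match_table):
    """Return the channel names of every record whose key equals the recognized name."""
    return [rec["channel"] for rec in match_table if rec["key"] == name]


def switch_channel(name, match_table, choose):
    """Pick the channel to switch to, or return None when nothing matches.

    `choose` stands in for the on-screen selection made with the remote control.
    """
    hits = lookup_channels(name, match_table)
    if not hits:
        return None
    if len(hits) == 1:
        # Single record: switch directly.
        return hits[0]
    # Multiple records: let the user pick one of the listed channels.
    return choose(hits)
```

A program name such as "News" may match several channels, which is the case where `choose` is called.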

Referring to Fig. 5, the method for updating the channel and program table in this embodiment of the invention includes the following steps:

Step 502: Check whether the channel and program table meets the configured update condition. The update condition can be set according to the user's needs; the update of the recognition vocabulary and matching table may be set to once a day. If the update condition is met, go to step 504; otherwise return to step 502.

Step 504: The channel switching device downloads the latest channel and program table data from the EPG server and updates the channel and program table.

The update may target the EPG server, or alternatively a local network, an optical disc, or the like.
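The check-then-download loop of steps 502 and 504 can be sketched with an interval test. The state dictionary, the `fetch` callback (standing in for the download from the EPG server or another source), and the one-day default are illustrative assumptions.

```python
def update_due(last_update, now, interval_seconds):
    """Step 502: the update condition is met once the interval has elapsed."""
    return now - last_update >= interval_seconds


def refresh_channel_table(state, now, fetch, interval_seconds=24 * 60 * 60):
    """Run one pass of steps 502 and 504; return True if the table was updated."""
    if not update_due(state["last_update"], now, interval_seconds):
        return False
    # Step 504: download the latest channel and program table and store it.
    state["table"] = fetch()
    state["last_update"] = now
    return True
```

Calling `refresh_channel_table` periodically reproduces the "otherwise return to step 502" branch without blocking.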

Referring to Fig. 6, the method for updating the recognition vocabulary and matching table in this embodiment of the invention includes the following steps:

Step 602: Check whether the recognition vocabulary and matching table meet the configured update condition. The update condition can be set according to the user's needs; the update of the recognition vocabulary and matching table may be set to once a minute. If the update condition is met, go to step 604; otherwise return to step 602.

Step 604: Update the local recognition vocabulary and matching table according to the channel and program table.

A person of ordinary skill in the art will understand that all or some of the steps of the above methods may be carried out by hardware under the instruction of a program, and that the program may be stored in a computer-readable storage medium such as RAM, ROM, or an optical disc.
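Step 604 derives the recognition vocabulary and the matching table from the channel and program table. A minimal sketch of one possible derivation follows; the input layout (a `channel` name with an optional `programs` list) and the output record layout are assumptions, since the patent does not fix either format.

```python
def build_tables(channel_table):
    """Derive (vocabulary, match_table) from a channel and program table."""
    vocabulary = []
    match_table = []
    for entry in channel_table:
        # Both the channel name and its program names become speakable entries.
        names = [entry["channel"]] + list(entry.get("programs", []))
        for name in names:
            if name not in vocabulary:
                vocabulary.append(name)
            match_table.append({"key": name, "channel": entry["channel"]})
    return vocabulary, match_table
```

The vocabulary keeps each name once, while the matching table keeps one record per (name, channel) pair so that a program carried by several channels yields several records.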

In the embodiments of the invention, the controller receives the user's voice input signal, the channel switching device recognizes the name to be matched from that signal, matches the name against the matching table to obtain the channel to be switched to, and switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, makes operation very convenient for the user, and makes full use of the capabilities of the channel switching device, which saves control cost. Because the channel switching device itself recognizes the name to be matched, no dedicated speech recognition server needs to be deployed in the network. This prevents long response times, avoids problems caused by data loss during network transmission, and saves the cost of building the network. Extracting the actual speech segment improves speech recognition accuracy and removes noise interference. When voice input is handled under the mute control unit, the set-top box is muted so that sound from the television does not interfere with the user's speech. The update module automatically updates the channel and program table from the EPG server, and the recognition vocabulary and matching table are updated accordingly, which spares the user the inconvenience of manual operation.

In summary, the above are merely preferred embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (20)

1. A speech recognition channel selection method, characterized in that the method comprises: a controller receiving a user's voice input signal; a channel switching device recognizing a name to be matched according to the input speech signal and a recognition vocabulary; matching the name to be matched against a matching table to obtain the channel to be switched to; and switching to the channel to be switched to.
2. The speech recognition channel selection method according to claim 1, characterized in that the method further comprises: receiving a voice activation instruction input by the user, the instruction being used to control the channel switching device to activate voice, and setting the channel switching device to the mute state.
3. The speech recognition channel selection method according to claim 1, characterized in that the channel switching device recognizing the name to be matched according to the input speech signal comprises: capturing and processing the speech signal input by the user, detecting the start and end points of the actual speech segment, and recognizing the name to be matched according to the start and end points of the actual speech segment.
4. The speech recognition channel selection method according to claim 1, characterized in that the channel switching device recognizing the name to be matched according to the input speech signal comprises: performing speech feature extraction on the speech signal; computing, according to an acoustic model and the recognition vocabulary, the acoustic distance of the speech feature data to the entries in the recognition vocabulary; and, if the acoustic distance of the speech feature data to an entry is less than a threshold, computing the channel name corresponding to the current speech according to the recognition vocabulary and the matching table.
5. The speech recognition channel selection method according to claim 4, characterized in that the method further comprises: if the acoustic distance of the speech feature data to the entry is greater than or equal to the threshold, prompting the user to re-enter the speech.
6. The speech recognition channel selection method according to claim 5, characterized in that the user is prompted to re-enter the speech by displaying on the TV screen that the speech currently input by the user cannot be recognized and prompting the user to re-enter it.
7. The speech recognition channel selection method according to claim 1, characterized in that the method further comprises: the channel switching device sending a stop-voice-capture control signal to the controller, and the controller stopping the capture and processing of the speech signal under control of the stop-voice-capture control signal.
8. The speech recognition channel selection method according to claim 1, characterized in that the method further comprises: the channel switching device updating the matching table and/or the recognition vocabulary according to an electronic program guide (EPG) server.
9. The speech recognition channel selection method according to claim 1, characterized in that the method further comprises: selecting, according to a language selection signal input by the user, an acoustic model corresponding to the language selection signal.
10. The speech recognition channel selection method according to claim 1, characterized in that the controller and the channel switching device communicate via a wireless transmission protocol.
11. The speech recognition channel selection method according to claim 10, characterized in that the wireless transmission protocol comprises one or more of: a high-speed infrared protocol, a Bluetooth transmission protocol, and a Zigbee transmission protocol.
12. A speech recognition channel selection system, characterized in that the system comprises: a controller configured to communicate with a channel switching processing device; the controller being configured to receive a user's voice input signal; and the channel switching processing device being configured to recognize a name to be matched according to the input voice signal and a recognition vocabulary, match the name to be matched against a matching table to obtain the channel to be switched to, and switch to the channel to be switched to.
13. The speech recognition channel selection system according to claim 12, characterized in that the system further comprises: an electronic program guide (EPG) server configured to provide a matching table to be updated and/or the latest recognition vocabulary, the channel switching device updating the matching table according to the matching table to be updated, and/or updating the recognition vocabulary according to the latest recognition vocabulary.
14. A channel switching device, characterized in that the device comprises: a receiving module configured to receive a user's voice input signal sent by a controller; a recognition processing module configured to recognize a name to be matched according to the input voice signal and a recognition vocabulary; a query matching module configured to match the name to be matched against a matching table to obtain the channel to be switched to; and a channel switching control module configured to switch to the channel to be switched to.
15. The channel switching device according to claim 14, characterized in that the device further comprises: a mute control module configured to set the channel switching device to the mute state according to a voice activation instruction input by the user.
16. The channel switching device according to claim 14, characterized in that the recognition processing module further comprises: a voice activity detection unit configured to detect the start and end points of the actual speech segment.
17. The channel switching device according to claim 14, characterized in that the recognition processing module further comprises: a speech feature extraction unit configured to perform speech feature extraction on the speech signal; a speech recognition unit configured to compute, according to an acoustic model and the recognition vocabulary, the acoustic distance of the input speech feature data to the entries in the recognition vocabulary; and a speech judgment unit configured to determine whether the acoustic distance of the speech feature data to an entry is less than a threshold and, if it is less than the threshold, to compute the channel name corresponding to the current speech according to the recognition vocabulary and the matching table.
18. The channel switching device according to claim 17, characterized in that the device further comprises: a rejection prompt module configured to prompt the user to re-enter the speech when the recognition result is non-speech.
19. The channel switching device according to claim 14, characterized in that the device further comprises: an update module configured to update the matching table and/or the recognition vocabulary according to an electronic program guide (EPG) server.
20. The channel switching device according to claim 14, characterized in that the device further comprises: a language selection module configured to select, according to a language selection signal input by the user, an acoustic model corresponding to the language selection signal.
CNA2008100654170A 2008-02-23 2008-02-23 Speech recognition channel selecting system, method and channel switching device CN101516005A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008100654170A CN101516005A (en) 2008-02-23 2008-02-23 Speech recognition channel selecting system, method and channel switching device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNA2008100654170A CN101516005A (en) 2008-02-23 2008-02-23 Speech recognition channel selecting system, method and channel switching device
PCT/CN2009/070380 WO2009103226A1 (en) 2008-02-23 2009-02-09 A voice recognition channel selection system, a voice recognition channel selection method and a channel switching device

Publications (1)

Publication Number Publication Date
CN101516005A true CN101516005A (en) 2009-08-26

Family

ID=40985065

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008100654170A CN101516005A (en) 2008-02-23 2008-02-23 Speech recognition channel selecting system, method and channel switching device

Country Status (2)

Country Link
CN (1) CN101516005A (en)
WO (1) WO2009103226A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20000042731A (en) * 1998-12-26 2000-07-15 전주범 Channel switching apparatus based on voice recognition of television
US6314398B1 (en) * 1999-03-01 2001-11-06 Matsushita Electric Industrial Co., Ltd. Apparatus and method using speech understanding for automatic channel selection in interactive television
CN2518278Y (en) * 2001-12-31 2002-10-23 海尔集团公司 Acoustic controlled telephone remote controller
CN2681491Y (en) * 2003-01-22 2005-02-23 程松林 Voice demander for television
CN2657310Y (en) * 2003-12-02 2004-11-17 肖奇 Sound controlled TV set
CN100538762C (en) * 2006-12-15 2009-09-09 广东协联科贸发展有限公司 Keying speech integrated remote controller

Also Published As

Publication number Publication date
WO2009103226A1 (en) 2009-08-27

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20090826

C12 Rejection of a patent application after its publication