CN101237520A

CN101237520A - A system and method for voice control of a set-top box

Info

Publication number: CN101237520A
Application number: CNA2008100826783A
Authority: CN
Inventors: 季松涛; 宋国栋; 唐纬
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2008-02-22
Filing date: 2008-02-22
Publication date: 2008-08-06

Abstract

The invention provides a system and a method for controlling a set-top box by voice, relating to the fields of digital television set-top boxes and IPTV set-top boxes. The system comprises a voice control module. In voice recording mode, the module takes in a voice recording from the user, processes it, then numbers and saves the processed recording. In voice control mode, the module receives a spoken operation command from the user and compares it with the saved recordings; when a matching recording is found, the module reads the address corresponding to that recording and sends a channel-change command to the set-top box channel control circuit according to the channel corresponding to that address. The method comprises a voice recording step and a voice control step. The system and method let users with disabilities, and users whose hands are occupied with other work, control the set-top box by voice, which makes the set-top box more convenient to operate and increases user satisfaction.

Description

A system and method for voice control of a set-top box

Technical field

The present invention relates to the fields of digital television set-top boxes and IPTV set-top boxes.

Background technology

With the spread of digital television and IPTV, the set-top box is becoming an essential piece of household equipment. Television programming keeps growing richer and the number of channels keeps increasing, so watching television will remain a major form of home entertainment for a long time.

At present, a set-top box is operated mainly with a handheld remote control. Although a remote control is easy to learn and easy to use, it has shortcomings in many special situations.

For a user with an upper-limb disability, operating a handheld remote control is difficult. Even an able-bodied user who is doing something else while watching television must free a hand to change channels by pressing buttons on the remote control or on the set-top box, which is inconvenient.

Summary of the invention

The purpose of the invention is to provide a system and method for voice control of a set-top box, so that the user can control the set-top box without interrupting other work being done by hand.

To achieve this purpose, the invention provides a voice-controlled set-top box system. The system comprises a voice control module. When the set-top box is in voice recording mode, the module takes in a user voice recording, processes it, and numbers and saves the processed recording. When the set-top box is in voice control mode, the module receives a user voice operation command, processes it, and compares it with the saved user voice recordings; when a matching recording is found, the module reads the address corresponding to the matching recording and sends a channel-change command to the set-top box channel control circuit according to the channel corresponding to that address.

The voice control module further comprises:

a high-gain microphone, which picks up the user's voice signal and passes it to the analog baseband signal processor;

an analog baseband signal processor, which processes the user voice recording and the user voice command into a digital signal and sends the resulting digital signal to the digital baseband signal processor;

a digital baseband signal processor, which numbers the received user voice recording data and saves it in the memory so that the data corresponds to a channel, and which also compares a received user voice command with the user voice recording data in the memory and, if matching data is found, sends a channel-change instruction to the set-top box channel control circuit according to the corresponding channel; and

a memory, which stores the user voice recordings so that each corresponds to a particular channel.

Further, the set-top box system also comprises: a record button, used to put the set-top box into voice recording mode; and a control button, used to put the set-top box into voice control mode.

Further, the processing that the analog baseband signal processor applies to the user voice recording and the user voice command comprises amplification, sampling, and filtering.

Further, the voice control module also comprises an RC filter network, which filters the weak electrical signal delivered by the high-gain microphone and sends the filtered voice signal to the analog baseband signal processor.
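As an illustration of what such an RC stage does, the following Python sketch models a first-order RC low-pass filter in discrete time. The patent does not state the filter topology or any component values, so the low-pass form, the function names `rc_cutoff_hz` and `rc_lowpass`, and the example values of R and C are all assumptions made for this sketch.

```python
import math


def rc_cutoff_hz(r_ohms, c_farads):
    """Cutoff (-3 dB) frequency of a first-order RC stage: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def rc_lowpass(samples, sample_rate_hz, r_ohms, c_farads):
    """Discrete-time approximation of a first-order RC low-pass filter.

    Assumption: the patent's "RC filter network" is modeled here as a single
    low-pass stage; the real network may differ.
    """
    dt = 1.0 / sample_rate_hz
    alpha = dt / (r_ohms * c_farads + dt)  # smoothing factor, between 0 and 1
    filtered = []
    y = 0.0
    for x in samples:
        y = y + alpha * (x - y)  # move the output a fraction alpha toward the input
        filtered.append(y)
    return filtered


if __name__ == "__main__":
    # Hypothetical component values: R = 4.7 kOhm, C = 10 nF.
    print(round(rc_cutoff_hz(4700.0, 1e-8)), "Hz")
```

With the assumed values R = 4.7 kΩ and C = 10 nF the cutoff comes out at roughly 3.4 kHz, which is in the range commonly used for speech band-limiting; a real design would choose the values to suit the microphone and the converter.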

Further, the analog baseband signal processor comprises an audio amplifier, an analog-to-digital converter, and a digital filter.

In combination with the above system, the invention also provides a method for voice control of a set-top box, comprising the following steps:

a voice recording step, in which the voice control module records the user's voice, converts the voice into a digital signal, and stores it in correspondence with a channel;

a voice control step, in which the voice control module receives a user voice operation command, converts the command into a digital signal, and compares it with the stored digital signals of the user voice recordings; if there is a matching signal, a channel-change instruction is sent to the set-top box channel control circuit according to the channel corresponding to the matching signal.

Further, in the voice control step, if there is no matching signal, the process ends.

Further, the voice recording step comprises:

S01: the high-gain microphone converts the audio signal of the user voice recording into a weak electrical signal;

S02: the RC filter network filters the weak electrical signal and sends it to the analog baseband signal processor;

S03: the audio amplifier circuit built into the analog baseband signal processor amplifies the weak electrical signal and sends it to the analog-to-digital converter;

S04: the analog-to-digital converter converts the analog signal into a digital audio signal, which undergoes sampling rate conversion and high-pass filtering and is then sent to the digital baseband signal processor;

S05: the digital baseband signal processor numbers the digital audio signal and stores it in the memory, so that each particular digital audio signal corresponds to a particular channel.
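Steps S03 and S04 can be pictured as a small signal chain. The Python sketch below is only a software model of that chain under stated assumptions: the gain, bit depth, decimation factor, and filter coefficient are hypothetical values, the function names are invented for illustration, and the sampling rate conversion is a naive "keep every n-th sample" without the anti-aliasing filter a real converter would include.

```python
def amplify(samples, gain):
    """S03: model the built-in audio amplifier as a constant gain."""
    return [gain * x for x in samples]


def quantize(samples, bits=16, full_scale=1.0):
    """S04 (conversion): clip to +/- full_scale and round to signed integer codes."""
    max_code = 2 ** (bits - 1) - 1
    codes = []
    for x in samples:
        clipped = max(-full_scale, min(full_scale, x))
        codes.append(int(round(clipped / full_scale * max_code)))
    return codes


def decimate(samples, factor):
    """S04 (sampling rate conversion): naive version, keeps every factor-th sample."""
    return samples[::factor]


def high_pass(samples, alpha=0.95):
    """S04 (high-pass filtering): y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def preprocess(weak_signal, gain=100.0, bits=16, factor=2, alpha=0.95):
    """Chain S03 and S04; all parameter defaults are assumptions, not patent values."""
    amplified = amplify(weak_signal, gain)
    codes = quantize(amplified, bits)
    resampled = decimate(codes, factor)
    return high_pass(resampled, alpha)
```

The high-pass stage removes any constant offset from the signal: a constant input decays toward zero at the output, which is the usual reason for placing such a filter ahead of speech analysis.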

Further, the voice control step comprises:

S11: the high-gain microphone converts the user voice operation command into a weak electrical signal;

S12: the RC filter network filters the weak electrical signal and sends it to the analog baseband signal processor;

S13: the audio amplifier circuit built into the analog baseband signal processor amplifies the weak electrical signal and sends it to the analog-to-digital converter;

S14: the analog-to-digital converter converts the analog signal into a digital audio signal, which undergoes sampling rate conversion and high-pass filtering and is then sent to the digital baseband signal processor;

S15: the digital baseband signal processor compares the digital audio signal with the digital audio signals of the user voice recordings in the memory; if there is a match, it sends a channel-change instruction to the set-top box channel control circuit according to the channel corresponding to the matching signal.

Through the above technical solution, the invention gives users a more convenient way to control the set-top box: users with disabilities and users whose hands are occupied with other work can control the set-top box by voice, which increases user satisfaction.

Description of drawings

Fig. 1 is a schematic structural diagram of the system of the invention;

Fig. 2 is a flow chart of user voice recording in the method of the invention;

Fig. 3 is a flow chart of user voice control in the method of the invention.

Embodiment

The invention is described in more detail below with reference to the drawings and specific embodiments.

The main idea of the invention is to add a voice control module to an existing set-top box. The module receives and stores the user's voice recordings, each stored in correspondence with a specific channel. When the user issues a voice control command, the module compares it with the stored data, finds the matching data, and carries out the corresponding operation.
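The idea described above can be sketched as a small data structure. The Python class below is a minimal illustration, not the patented implementation: the class name `VoiceControlModule`, the `send_channel_command` callback standing in for the channel control circuit, and the `is_match` predicate are all hypothetical, and the patent does not specify how addresses are assigned (here they are simply consecutive integers).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Signal = Sequence[float]


@dataclass
class VoiceControlModule:
    # Hypothetical callback standing in for the channel control circuit.
    send_channel_command: Callable[[int], None]
    # Hypothetical similarity test; the comparison method is left open here.
    is_match: Callable[[Signal, Signal], bool]
    # Memory: address -> (stored recording, channel it corresponds to).
    memory: Dict[int, Tuple[List[float], int]] = field(default_factory=dict)

    def record(self, signal: Signal, channel: int) -> int:
        """Voice recording mode: number the recording and store it with its channel."""
        address = len(self.memory)  # assumed numbering scheme: 0, 1, 2, ...
        self.memory[address] = (list(signal), channel)
        return address

    def control(self, signal: Signal) -> Optional[int]:
        """Voice control mode: compare with stored recordings in address order."""
        for address in sorted(self.memory):
            stored, channel = self.memory[address]
            if self.is_match(signal, stored):
                self.send_channel_command(channel)
                return channel
        return None  # no match: no operation is performed
```

A caller would construct the module with its own matcher and command sink, call `record` once per channel while in recording mode, and call `control` for each utterance in control mode.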

Fig. 1 shows the internal structure of the voice control module, which comprises a high-gain microphone, an RC filter network, an analog baseband signal processor, a digital baseband signal processor, and a memory, connected in sequence.

The high-gain microphone, also called the built-in transmitter, converts the voice signal into a weak electrical signal, which is filtered by the RC filter network and then fed directly into the analog baseband signal processor.

The analog baseband signal processor contains an audio amplifier, an analog-to-digital converter, a digital filter, and so on. An incoming analog voice signal is first sent to the audio amplifier circuit; after amplification it goes to the analog-to-digital converter, which converts it into a digital voice signal. The digital voice signal passes through the sampling rate conversion unit and the digital high-pass filter and is then delivered to the interface with the digital baseband signal processor.

The digital baseband signal processor doubles as controller and digital signal processor. It is responsible for putting digital voice signals into one-to-one correspondence with specific operation commands. In voice recording mode, it stores the digital audio signal in the memory; in voice control mode, it compares the voice signal with the data previously stored in the memory and, on a match, sends the operation instruction.

The memory stores the preset voice commands.

Fig. 2 is the flow chart of user voice recording in the method of the invention.

When the user presses the record button, the set-top box enters voice recording mode. Suppose the television is tuned to CCTV-1 at that moment. The screen shows the prompt "Please input the voice command for channel selection", and the user can say "one" or "CCTV-1". The user's voice enters through the microphone and is processed by the analog baseband signal processor; then, under a control signal issued by the digital baseband signal processor, it is delivered to the digital baseband signal processor. The processor uses a speech feature parameter extraction technique to extract the feature parameters of the speech and stores them in the memory at the address corresponding to CCTV-1. After the voice command has been recorded, the record button is switched off.

Speech feature parameter extraction is prior art. Linear prediction (LP) analysis is a widely used feature extraction technique, and many successful application systems use cepstral parameters extracted with LP. However, the linear prediction model is a purely mathematical model and does not take into account how the human auditory system processes speech. Mel parameters, and the perceptual linear prediction cepstrum extracted through perceptual linear prediction (PLP) analysis, model to some extent how the human ear processes speech and draw on research results in human auditory perception. Experiments show that adopting this technique improves the performance of a speech recognition system.
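To make the LP-based cepstral parameters mentioned above more concrete, the following Python sketch computes linear prediction coefficients with the Levinson-Durbin recursion and converts them to cepstral coefficients with the standard recursion for an all-pole model. It covers plain LP cepstra only; the Mel and PLP variants are not implemented. The function names, the predictor convention x[n] ≈ Σ a[k]·x[n−k], and the absence of framing and windowing are simplifying assumptions of this sketch.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n + lag], for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for LP coefficients a[1..order] from autocorrelation values r[0..order].

    Returns (coefficients, residual_energy) for the predictor
    x[n] ~ a[1]*x[n-1] + ... + a[order]*x[n-order].
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or perfectly predictable input: stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1 / (1 - sum a[k] z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpc_cepstrum(frame, order=10, n_ceps=12):
    """Feature vector for one frame; order and n_ceps are assumed example values."""
    r = autocorrelation(frame, order)
    a, _ = levinson_durbin(r, order)
    return lpc_to_cepstrum(a, n_ceps)
```

A practical front end would split the utterance into short overlapping frames, apply a window to each, and compute one such vector per frame; those steps are omitted here to keep the recursion itself visible.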

The analog baseband signal processor processes the signal as follows: its built-in audio amplifier circuit amplifies the received weak electrical signal and sends it to the analog-to-digital converter; the analog-to-digital converter converts the analog signal into a digital audio signal, which then undergoes sampling rate conversion and high-pass filtering, completing the signal processing.

As shown in Fig. 3, when the control button is switched on, the set-top box enters voice control mode. If the set-top box receives speech that matches a previously recorded entry, it performs the operation corresponding to that speech. For example, if the television is tuned to CCTV-4 and the user says "one" or "CCTV-1" (depending on what the user recorded earlier), the set-top box automatically switches to CCTV-1. The user's voice enters the set-top box through the microphone and is processed by the analog baseband signal processor into a digital voice signal (the processing is the same as during voice recording and is not repeated here). After the signal is delivered to the digital baseband signal processor, the feature parameters of the speech are extracted using Mel parameters and perceptual linear prediction (PLP) analysis and are compared one by one with the data stored in recording mode. If matching data is found, channel information is sent to the channel control circuit according to the channel corresponding to the address where the data is stored. If the received speech does not match any previously recorded entry, no operation is performed.
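The one-by-one comparison with stored data can be sketched as a nearest-template search with a rejection threshold. The patent does not say how a "match" is decided, so the Euclidean distance, the threshold, the fixed-length feature vectors, and the names `euclidean` and `find_channel` in the Python sketch below are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_channel(features, templates, threshold):
    """Compare `features` with each stored template in address order.

    `templates` maps address -> (feature vector, channel). Returns the channel
    of the closest template whose distance is below `threshold`, or None when
    nothing is close enough (in which case no operation is performed).
    """
    best_channel = None
    best_distance = threshold
    for address in sorted(templates):
        stored, channel = templates[address]
        distance = euclidean(features, stored)
        if distance < best_distance:
            best_distance = distance
            best_channel = channel
    return best_channel


if __name__ == "__main__":
    # Hypothetical two-dimensional templates for CCTV-1 and CCTV-4.
    templates = {0: ([1.0, 0.0], 1), 1: ([0.0, 1.0], 4)}
    print(find_channel([0.9, 0.1], templates, threshold=0.5))
```

Choosing the closest template below the threshold, rather than the first one that passes, is a design choice of this sketch: it avoids switching to the wrong channel when two recordings sound similar. Utterances of different lengths would need a time-alignment method before such a distance could be applied.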

Of course, the invention may also have various other embodiments. Without departing from the spirit and essence of the invention, those of ordinary skill in the art can make corresponding changes and variations according to the invention, but all such changes and variations shall fall within the scope of protection of the appended claims.

Claims

1. A voice-controlled set-top box system, characterized in that it comprises a voice control module which, when the set-top box is in voice recording mode, takes in a user voice recording, processes the user voice recording, and numbers and saves the processed user voice recording; and which, when the set-top box is in voice control mode, receives a user voice operation command, processes the user voice operation command, compares the user voice operation command with the saved user voice recordings, and, when a matching recording is found, reads the address corresponding to the matching recording and sends a channel-change command to the set-top box channel control circuit according to the channel corresponding to that address.

2. The set-top box system as claimed in claim 1, characterized in that the voice control module comprises:

3. The set-top box system as claimed in claim 2, characterized in that the set-top box system also comprises: a record button, used to put the set-top box into voice recording mode; and a control button, used to put the set-top box into voice control mode.

4. The set-top box system as claimed in claim 2, characterized in that the processing that the analog baseband signal processor applies to the user voice recording and the user voice command comprises amplification, sampling, and filtering.

5. The set-top box system as claimed in claim 2, characterized in that the voice control module also comprises an RC filter network, which filters the weak electrical signal delivered by the high-gain microphone and sends the filtered voice signal to the analog baseband signal processor.

6. The set-top box system as claimed in claim 2, characterized in that the analog baseband signal processor comprises an audio amplifier, an analog-to-digital converter, and a digital filter.

7. A method for voice control of a set-top box, characterized in that it comprises the following steps:

8. The method as claimed in claim 7, characterized in that, in the voice control step, if there is no matching signal, the process ends.

9. The method as claimed in claim 7, characterized in that

the voice recording step comprises the following steps:

S901: the high-gain microphone converts the audio signal of the user voice recording into a weak electrical signal;

S902: the RC filter network filters the weak electrical signal and sends it to the analog baseband signal processor;

S903: the audio amplifier circuit built into the analog baseband signal processor amplifies the weak electrical signal and sends it to the analog-to-digital converter;

S904: the analog-to-digital converter converts the analog signal into a digital audio signal, which undergoes sampling rate conversion and high-pass filtering and is then sent to the digital baseband signal processor;

S905: the digital baseband signal processor numbers the digital audio signal and stores it in the memory, so that each particular digital audio signal corresponds to a particular channel.

10. The method as claimed in claim 7 or 9, characterized in that

the voice control step comprises the following steps:

S1001: the high-gain microphone converts the user voice operation command into a weak electrical signal;

S1002: the RC filter network filters the weak electrical signal and sends it to the analog baseband signal processor;

S1003: the audio amplifier circuit built into the analog baseband signal processor amplifies the weak electrical signal and sends it to the analog-to-digital converter;

S1004: the analog-to-digital converter converts the analog signal into a digital audio signal, which undergoes sampling rate conversion and high-pass filtering and is then sent to the digital baseband signal processor;

S1005: the digital baseband signal processor compares the digital audio signal with the digital audio signals of the user voice recordings in the memory; if there is a match, it sends a channel-change instruction to the set-top box channel control circuit according to the channel corresponding to the matching signal.