CN101557432B - Mobile terminal and menu control method thereof - Google Patents
- Publication number
- CN101557432B (grant) · CN200810127910A (application)
- Authority
- CN
- China
- Prior art keywords
- mobile terminal
- menu
- voice command
- user
- controller
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/1613—Constructional details or arrangements for portable computers
- G06F1/1633—Constructional details or arrangements of portable computers not specific to the type of enclosures covered by groups G06F1/1615 - G06F1/1626
- G06F1/1684—Constructional details or arrangements related to integrated I/O peripherals not covered by groups G06F1/1635 - G06F1/1675
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/0482—Interaction with lists of selectable items, e.g. menus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/14—Digital output to display device ; Cooperation and interconnection of the display device with other functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Telephone Function (AREA)
Abstract
A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal and a memory configured to store multiple domains related to menus and operations of the mobile terminal. It further includes a controller configured to access a specific domain among the multiple domains included in the memory based on the received input to activate the voice recognition function, to recognize user speech based on a language model and an acoustic model of the accessed domain, and to determine at least one menu and operation of the mobile terminal based on the accessed specific domain and the recognized user speech.
Description
Background of invention
1. technical field
The present invention relates to a mobile terminal, and to a corresponding method for improving the voice recognition rate by setting the domain used for voice recognition to information related to a specific menu or service.
2. Description of the Related Art
Mobile terminals now provide many additional services beyond basic call services. For example, users can now access the Internet, play games, watch videos, listen to music, capture images and video, record audio files, and more. Mobile terminals also provide broadcast programs, so that users can watch television shows, sports programs, videos, etc.
In addition, because the functions included in mobile terminals have increased significantly, user interfaces have also become more complex. For example, user interfaces now include touch screens that allow the user to touch and select a particular item or menu option. Mobile terminals also include very limited voice recognition functions that let the user perform rudimentary operations. However, the error rate in determining the meaning of a user's voice instruction is very high, so users generally do not use the limited voice recognition features on the terminal.
Summary of the Invention
Accordingly, one object of the present invention is to address the above-noted and other problems.
Another object of the present invention is to provide a mobile terminal and corresponding method for controlling menus related to the terminal's specific functions or services by recognizing the meaning of a voice command based on its context and content.
Yet another object of the present invention is to provide a mobile terminal and corresponding method that significantly improve the voice recognition rate by designating the domain used for voice recognition as a domain related to a specific menu or service.
Still another object of the present invention is to provide a mobile terminal and corresponding method for controlling menus related to specific functions or services by detecting that the user has activated the voice recognition function through one or more of the terminal's user interfaces (UIs).
A further object of the present invention is to provide a mobile terminal and corresponding method that provide help information about voice command input according to the operating state or operating mode of the mobile terminal, so that even a novice user can control menus related to specific functions or services using his or her voice commands.
To achieve these and other advantages, and in accordance with the purpose of the present invention as embodied and broadly described herein, the present invention provides in one aspect a mobile terminal including: an input unit configured to receive an input for activating a voice recognition function on the mobile terminal; a memory configured to store multiple domains related to menus and operations of the mobile terminal; and a controller configured to access a specific domain among the multiple domains in the memory based on the received input for activating the voice recognition function, to recognize user speech based on a language model and an acoustic model of the accessed domain, and to determine at least one menu and operation of the mobile terminal based on the accessed specific domain and the recognized user speech.
In another aspect, the present invention provides a method of controlling a mobile terminal. The method includes: receiving an input for activating a voice recognition function on the mobile terminal; accessing, based on the received input for activating the voice recognition function, a specific domain among multiple domains stored in a memory of the mobile terminal; recognizing user speech based on a language model and an acoustic model of the accessed domain; and outputting at least one menu and operation of the mobile terminal based on the accessed specific domain and the recognized user speech.
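As a rough illustration of the domain-scoped recognition described above — restricting what the recognizer matches against to the commands of one accessed domain — consider the following sketch. All class names, command strings, and the word-overlap scoring are assumptions made for illustration, not the patent's implementation:

```python
# Sketch of domain-scoped voice command recognition.
# Names and the scoring scheme are illustrative assumptions only.

class Domain:
    """A recognition domain: vocabulary tied to one menu or service."""
    def __init__(self, name, commands):
        self.name = name
        self.commands = commands  # commands the recognizer is restricted to

class VoiceController:
    def __init__(self, domains):
        self.domains = domains  # the "memory" holding multiple domains
        self.active = None

    def activate(self, domain_name):
        """Access one specific domain when recognition is activated."""
        self.active = self.domains[domain_name]

    def recognize(self, utterance):
        """Match user speech only against the active domain's commands."""
        words = set(utterance.lower().split())
        best, best_score = None, 0.0
        for command in self.active.commands:
            overlap = words & set(command.split())
            score = len(overlap) / max(len(command.split()), 1)
            if score > best_score:
                best, best_score = command, score
        return best, best_score

domains = {
    "multimedia": Domain("multimedia", ["play song", "pause song", "next track"]),
    "phonebook":  Domain("phonebook",  ["call contact", "add contact"]),
}
vc = VoiceController(domains)
vc.activate("multimedia")
command, score = vc.recognize("please play the song")
print(command)  # → play song
```

Because only the "multimedia" commands are candidates here, the phonebook vocabulary cannot produce a false match, which is the intuition behind the improved recognition rate claimed above.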
Further scope of the applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
Brief Description of the Drawings
The present invention will become more fully understood from the detailed description given hereinbelow and the accompanying drawings, which are given by way of illustration only and thus are not limitative of the present invention, and wherein:
Fig. 1 is a block diagram of a mobile terminal according to an embodiment of the present invention;
Fig. 2 is a front perspective view of a mobile terminal according to an embodiment of the present invention;
Fig. 3 is a rear perspective view of the mobile terminal shown in Fig. 2;
Fig. 4 is an overview of a communication system operable with the mobile terminal of the present invention;
Fig. 5 is a flow chart illustrating a menu control method for a mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 6A is an overview showing a method of activating a voice recognition function of the mobile terminal according to an embodiment of the present invention;
Figs. 6B and 6C are overviews showing a method of outputting help information of the mobile terminal according to an embodiment of the present invention;
Fig. 7A is a flow chart showing a method of recognizing a voice command of the mobile terminal according to an embodiment of the present invention;
Fig. 7B is an overview showing a method of recognizing a voice command of the mobile terminal according to an embodiment of the present invention;
Fig. 8 is an overview showing a method of displaying menus according to their voice recognition rates in the mobile terminal according to an embodiment of the present invention;
Fig. 9 is an overview showing a method of recognizing a voice command of the mobile terminal according to another embodiment of the present invention;
Fig. 10 is an overview of a database configuration used as a reference for voice command recognition of the mobile terminal according to an embodiment of the present invention;
Fig. 11 is an overview showing a state in which the voice recognition function of the mobile terminal is being executed according to an embodiment of the present invention;
Fig. 12 is an overview showing a method of processing subcommands related to a specific menu in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 13 is an overview showing a method of searching a subway map in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 14 is an overview showing a method of playing multimedia files in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 15 is an overview showing a method of sending e-mail in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 16 is an overview showing a method of making a phone call in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 17 is an overview showing a method of using phone book information in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 18 is an overview showing a method of changing a background screen in the mobile terminal through voice commands according to an embodiment of the present invention;
Fig. 19 is an overview showing a method of playing multimedia files in the mobile terminal through voice commands according to an embodiment of the present invention.
Detailed Description of the Embodiments
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.
Fig. 1 is a block diagram of a mobile terminal 100 according to an embodiment of the present invention. As shown, the mobile terminal 100 includes a wireless communication unit 110 having one or more components that permit wireless communication between the mobile terminal 100 and the wireless communication system or network in which the terminal is located.
For example, the wireless communication unit 110 includes a broadcast receiving module 111 that receives a broadcast signal and/or broadcast associated information from an external broadcast managing entity via a broadcast channel. The broadcast channel may include a satellite channel and a terrestrial channel.
In addition, the broadcast managing entity generally refers to a system that transmits a broadcast signal and/or broadcast associated information. Examples of broadcast associated information include information associated with a broadcast channel, a broadcast program, a broadcast service provider, etc. For instance, the broadcast associated information may include an electronic program guide (EPG) for digital multimedia broadcasting (DMB) and an electronic service guide (ESG) for digital video broadcast-handheld (DVB-H).
In addition, the broadcast signal may be implemented as a TV broadcast signal, a radio broadcast signal, a data broadcast signal, and the like. The broadcast signal may further include a broadcast signal combined with a TV or radio broadcast signal.
Also included is a wireless Internet module 113 that supports Internet access for the mobile terminal. The module 113 may be internally or externally coupled to the terminal. The wireless communication unit 110 also includes a short-range communication module 114 that facilitates relatively short-range communication. Suitable technologies for implementing this module include radio frequency identification (RFID), infrared data association (IrDA), and ultra-wideband (UWB), as well as the networking technologies commonly referred to as Bluetooth and ZigBee, to name a few.
A position-location module 115 is also included in the wireless communication unit 110 and identifies or otherwise obtains the location of the mobile terminal 100. The position-location module 115 may be implemented using global positioning system (GPS) components that cooperate with associated satellites, network components, and combinations thereof.
In addition, as shown in Fig. 1, the mobile terminal 100 also includes an audio/video (A/V) input unit 120 that provides audio or video signals to the mobile terminal 100. As shown, the A/V input unit 120 includes a camera 121 and a microphone 122. The camera 121 receives and processes image frames of still pictures or video.
Further, the microphone 122 receives an external audio signal while the portable device is in a particular mode, such as a phone call mode, a recording mode, or a voice recognition mode. The received audio signal is then processed and converted into digital data. Also, the portable device, and in particular the A/V input unit 120, generally includes noise removal algorithms to remove noise generated in the course of receiving the external audio signal. In addition, data generated by the A/V input unit 120 may be stored in the memory 160, utilized by the output unit 150, or transmitted via one or more modules of the communication unit 110. If desired, two or more microphones and/or cameras may be used.
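As a toy illustration of the kind of noise removal mentioned above, the sketch below applies a simple amplitude gate to digitized microphone samples. Real handset denoising uses far more sophisticated spectral methods; the function, sample values, and threshold here are purely illustrative assumptions:

```python
# Minimal noise-gate sketch: zero out low-amplitude samples that are
# presumed to be background noise. Illustrative assumption only; not
# the noise removal algorithm of any actual terminal.

def noise_gate(samples, threshold=0.05):
    """Zero out samples whose amplitude falls below the noise threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

audio = [0.01, -0.02, 0.4, -0.35, 0.03, 0.5]  # normalized samples
gated = noise_gate(audio)
print(gated)  # → [0.0, 0.0, 0.4, -0.35, 0.0, 0.5]
```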
A sensing unit 140 is also included in the mobile terminal 100 and provides status measurements of various aspects of the mobile terminal 100. For instance, the sensing unit 140 may detect an open/closed status of the mobile terminal 100, the relative positioning of components of the mobile terminal 100 (such as a display and a keypad), a change of position of the mobile terminal 100 or a component of the mobile terminal 100, a presence or absence of user contact with the mobile terminal 100, and orientation or acceleration/deceleration of the mobile terminal 100, etc.
As an example, when the mobile terminal 100 is a slide-type mobile terminal, the sensing unit 140 may sense whether a sliding portion of the mobile terminal 100 is open or closed. Other examples include the sensing unit 140 sensing whether power is provided by the power supply 190 and whether a coupling or other connection exists between the interface unit 170 and an external device.
Further, the interface unit 170 is often implemented to couple the mobile terminal with external devices. Typical external devices include wired/wireless headsets, external chargers, power supplies, storage devices configured to store data (e.g., audio, video, pictures, etc.), earphones, microphones, and the like. In addition, the interface unit 170 may be configured using a wired/wireless data port, a card socket (e.g., for coupling to a memory card, a subscriber identity module (SIM) card, a user identity module (UIM) card, a removable user identity module (RUIM) card, etc.), audio input/output ports, and video input/output ports.
In addition, the display 151 preferably also includes a touch screen working in cooperation with an input device, such as a touchpad. This configuration permits the display 151 to function both as an output device and an input device. Further, the display 151 may be implemented using display technologies including, for example, a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light-emitting diode display (OLED), a flexible display, and a three-dimensional display.
Fig. 1 further shows the output unit 150 having an audio output module 152 that supports the audio output requirements of the mobile terminal 100. The audio output module 152 is often implemented using one or more speakers, buzzers, other audio producing devices, and combinations thereof. Further, the audio output module 152 functions in various modes including a call-receiving mode, a call-placing mode, a recording mode, a voice recognition mode, and a broadcast reception mode. During operation, the audio output module 152 outputs audio relating to a particular function (e.g., call received, message received, and errors).
In addition, the output unit 150 in the figure also includes an alarm 153, which is used to signal or otherwise identify the occurrence of a particular event associated with the mobile terminal 100. Alarm events include a call received, a message received, and user input received. An example of such output includes providing tactile sensations (e.g., vibration) to the user. For instance, the alarm 153 may be configured to vibrate responsive to the mobile terminal 100 receiving a call or message.
As another example, vibration may be provided by the alarm 153 responsive to receiving user input at the mobile terminal 100, thus providing a tactile feedback mechanism. Further, the various outputs provided by the components of the output unit 150 may be separately performed, or such output may be performed using any combination of such components.
In addition, the memory 160 is used to store various types of data to support the processing, control, and storage requirements of the mobile terminal 100. Examples of such data include program instructions for applications operating on the mobile terminal 100, call history, contact data, phonebook data, messages, pictures, video, etc.
Further, the memory 160 shown in Fig. 1 may be implemented using any type (or combination) of suitable volatile and non-volatile memory or storage devices, including random access memory (RAM), static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk, card-type memory, or other similar memory or data storage devices.
In addition, the power supply 190 provides power required by the various components of the portable device. The provided power may be internal power, external power, or combinations thereof.
Next, Fig. 2 is a front perspective view of the mobile terminal 100 according to an embodiment of the present invention. As shown in Fig. 2, the mobile terminal 100 includes a first body 200 configured to slidably cooperate with a second body 205. The user input unit 130 described in Fig. 1 may include a first input unit such as function keys 210, a second input unit such as a keypad 215, and a third input unit such as side keys 245.
In addition, the first body 200 slides relative to the second body 205 between open and closed positions. In the closed position, the first body 200 is positioned over the second body 205 in such a manner that the keypad 215 is substantially or completely obscured by the first body 200. In the open position, user access to the keypad 215, as well as to the display 151 and the function keys 210, is possible. The function keys 210 are convenient for the user to enter commands such as start, stop, and scroll.
Further, the mobile terminal 100 is operable in either a standby mode (e.g., able to receive a call or message, or to receive and respond to network control signaling) or an active call mode. Typically, the mobile terminal 100 functions in the standby mode when in the closed position, and in the active mode when in the open position. However, this mode configuration may be changed as required or desired.
In addition, the first body 200 is formed from a first case 220 and a second case 225, and the second body 205 is formed from a first case 230 and a second case 235. The respective first and second cases are usually formed from a suitably rigid material, such as injection-molded plastic, or formed using metallic material, such as stainless steel (STS) and titanium (Ti).
If desired, one or more intermediate cases may be provided between the first and second cases of one or both of the first and second bodies 200 and 205. In addition, the first and second bodies 200 and 205 are sized to receive electronic components used to support the operation of the mobile terminal 100.
The first body 200 also includes the camera 121 and the audio output unit 152, which is configured as a speaker, positioned relative to the display 151. The camera 121 may also be constructed in such a manner that it can be selectively positioned (e.g., rotated, swiveled, etc.) relative to the first body 200.
Further, the function keys 210 are positioned adjacent to a lower side of the display 151. As discussed above, the display 151 can be implemented as an LCD or OLED. The display 151 may also be configured as a touch screen having an underlying touchpad that generates signals responsive to user contact (e.g., finger, stylus, etc.) with the touch screen.
The second body 205 also includes the microphone 122 positioned adjacent to the keypad 215, and the side keys 245, which are one type of user input unit, positioned along the side of the second body 205. Preferably, the side keys 245 are configured as hot keys, such that the side keys 245 are associated with a particular function of the mobile terminal 100. As shown, the interface unit 170 is positioned adjacent to the side keys 245, and the power supply 190 in the form of a battery is located on a lower portion of the second body 205.
Fig. 3 is a rear perspective view of the mobile terminal shown in Fig. 2. As shown in Fig. 3, the second body 205 includes the camera 121 with an associated flash 250 and mirror 255. The flash 250 operates in conjunction with the camera 121 of the second body 205, and the mirror 255 is useful for assisting the user in positioning the camera 121 in a self-portrait mode. In addition, the camera 121 of the second body 205 faces a direction opposite to the direction faced by the camera 121 of the first body 200 shown in Fig. 2.
In addition, each of the cameras 121 of the first and second bodies may have the same or different capabilities. For example, in one embodiment, the camera 121 of the first body 200 operates with a relatively lower resolution than the camera 121 of the second body 205. Such an arrangement works well, for example, during a video conference call in which reverse link bandwidth capabilities are limited. Further, the relatively higher resolution of the camera 121 of the second body 205 (Fig. 3) is useful for obtaining higher-quality pictures for later use.
The second body 205 also includes the audio output module 152, configured as a speaker, located on an upper side of the second body 205. The audio output modules of the first and second bodies 200 and 205 may also cooperate to provide stereo output. Moreover, either or both of these audio output modules may be configured to operate as a speakerphone.
In addition, the arrangement of the various components of the first and second bodies 200 and 205 shown may be modified as required or desired. In general, some or all of the components of one body may alternatively be implemented on the other body. Further, such components may be placed at locations and relative positions differing from those shown in the representative figures.
Further, the mobile terminal 100 of Figs. 1-3 may be configured to operate within a communication system that transmits data via frames or packets, including both wireless and wired communication systems, as well as satellite-based communication systems. Such communication systems utilize different air interfaces and/or physical layers.
Examples of air interfaces utilized by the communication systems include, for example, frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), the universal mobile telecommunications system (UMTS), the long term evolution (LTE) of UMTS, and the global system for mobile communications (GSM). By way of non-limiting example only, further description will relate to a CDMA communication system, but such teachings apply equally to other system types.
Next, Fig. 4 illustrates a CDMA wireless communication system having a plurality of mobile terminals 100, a plurality of base stations 270, a plurality of base station controllers (BSCs) 275, and a mobile switching center (MSC) 280.
Each base station 270 may also include one or more sectors, each sector having an omnidirectional antenna or an antenna pointed in a particular direction radially away from the base station 270. Alternatively, each sector may include two antennas for diversity reception. In addition, each base station 270 may be configured to support a plurality of frequency assignments, with each frequency assignment having a particular spectrum (e.g., 1.25 MHz, 5 MHz).
The intersection of a sector and frequency assignment is referred to as a CDMA channel. The base stations 270 may also be referred to as base station transceiver subsystems (BTSs). In some instances, the term "base station" may be used to refer collectively to a BSC 275 and one or more base stations 270.
The base stations 270 may also be denoted as "cell sites." Alternatively, individual sectors of a given base station 270 may be referred to as cell sites. In addition, a terrestrial digital multimedia broadcasting (DMB) transmitter 295 is shown broadcasting to the mobile terminals 100 operating within the system.
Further, the broadcast receiving module 111 (Fig. 1) of the mobile terminal 100 is typically configured to receive broadcast signals transmitted by the DMB transmitter 295. As described above, similar arrangements may be implemented for other types of broadcast and multicast signaling.
Fig. 4 also shows several global positioning system (GPS) satellites 300. These satellites facilitate locating the position of some or all of the mobile terminals 100. Two satellites are shown in Fig. 4; however, positioning information may be obtained with more or fewer satellites.
In addition, the position-location module 115 (Fig. 1) of the mobile terminal 100 is typically configured to cooperate with the satellites 300 to obtain desired position information. However, other types of position detection technology, such as location technology that may be used in addition to or instead of GPS location technology, may alternatively be implemented. Some or all of the GPS satellites 300 may alternatively or additionally be configured to provide satellite DMB transmissions.
Further, during typical operation of the wireless communication system, the base stations 270 receive sets of reverse-link signals from various mobile terminals 100. The mobile terminals 100 engage in calls, messaging, and other communications.
In addition, each reverse-link signal received by a given base station 270 is processed within that base station 270, and the resulting data is forwarded to an associated BSC 275. The BSC provides call resource allocation and mobility management functionality, including soft handoffs between the base stations 270.
Further, the BSC 275 also routes the received data to the MSC 280, which provides additional routing services for interfacing with the PSTN 290. Similarly, the PSTN interfaces with the MSC 280, and the MSC 280 interfaces with the BSCs 275. The BSCs 275 also control the base stations 270 to transmit sets of forward-link signals to the mobile terminals 100.
In the following description, a control method applicable to the above-configured mobile terminal 100 is explained with respect to various embodiments. However, the following embodiments can be implemented independently or through combinations thereof. In addition, in the following description, it is assumed that the display 151 includes a touch screen. Further, the touch screen or its screen may be indicated by the reference number "400."
In an embodiment of the present invention, the terminal designates the domain of the database used as a reference for voice command recognition (or the information search range) as a domain related to a specific menu or service. Accordingly, the recognition rate for a voice command is improved, and the overall amount of resources used by the mobile terminal is reduced.
In addition, the domain of the database used as a reference for voice recognition can be specified through an environment setting menu of the mobile terminal. Also, once the voice recognition function is activated, the designated domain is automatically applied.
Hereinafter, it is assumed that the preset domain of the database for voice command recognition includes information relating to the menus currently displayed on the display 151, or information relating to submenus of one of those menus.
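The presumed domain above — information tied to the currently displayed menus and their submenus — could be assembled roughly as follows. The menu tree and its entries are invented for illustration only:

```python
# Build the voice-recognition vocabulary from the menus currently shown
# on screen plus their submenus. Menu names are illustrative assumptions.

MENU_TREE = {
    "multimedia": ["play", "pause", "next"],
    "messages":   ["send", "inbox", "drafts"],
}

def domain_for_screen(visible_menus, tree=MENU_TREE):
    """Collect the visible menu names and their submenu items into one
    vocabulary that scopes the recognizer's search range."""
    vocab = set()
    for menu in visible_menus:
        vocab.add(menu)
        vocab.update(tree.get(menu, []))
    return vocab

vocab = domain_for_screen(["multimedia"])
print(sorted(vocab))  # → ['multimedia', 'next', 'pause', 'play']
```

Restricting the vocabulary to what is on screen is what keeps the search range, and hence the resource usage, small.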
Next, Fig. 5 is a flow chart illustrating a menu control method for a mobile terminal through voice commands according to an embodiment of the present invention. Reference is also made to Fig. 1 in the following description. As shown in Fig. 5, the controller 180 determines whether the voice recognition function has been activated (S101).
In addition, the voice recognition function can be activated by the user selecting a hardware button on the mobile terminal or a soft touch button on the display module 151. The user may also activate the voice recognition function by manipulating a specific menu displayed on the display 151. The voice recognition function may also be activated by the user generating a specific sound or sound effect, by a short-range or long-range wireless signal, or by body information of the user, such as hand gestures or body movements.
In more detail, the specific sound or sound effect may include an impact sound having a level above a specific level. Further, the specific sound or sound effect can be detected using a sound level detection algorithm. In addition, the sound level detection algorithm is preferably simpler than a voice recognition algorithm, and thus consumes fewer resources of the mobile terminal. Also, the sound level detection algorithm (or circuit) may be implemented separately from the voice recognition algorithm or circuit, or may be implemented as a partial function of the voice recognition algorithm.
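A sound level detection algorithm of this kind can be much cheaper than full recognition because it only measures signal energy. The following is a minimal sketch of such a detector, assuming raw PCM sample values; the RMS measure and the threshold value are illustrative assumptions, not taken from the patent.

```python
import math

def frame_rms(samples):
    """Root-mean-square level of one audio frame (a list of PCM samples)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_activation_sound(samples, threshold=5000.0):
    """Return True when the frame's level exceeds the activation threshold,
    e.g. for an impact sound such as a hand clap (threshold is illustrative)."""
    return frame_rms(samples) > threshold
```

Because this check runs per frame with a handful of arithmetic operations, it can stay on continuously while the expensive recognizer remains idle.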
Further, the wireless signal may be received through the wireless communication unit 110, and the user's gesture or body motion may be received through the sensing unit 140. Therefore, in an embodiment of the present invention, the wireless communication unit 110, the user input unit 130 and the sensing unit 140 may be referred to as a signal input unit. The voice recognition function may also be terminated in a similar manner.

Having the user physically activate the voice recognition function is particularly advantageous, because the user becomes more aware that they are about to control the terminal with voice commands. That is, because the user first needs to perform a physical manipulation of the terminal, he or she intuitively recognizes that a voice command or instruction is going to be input into the terminal, and therefore speaks more clearly or more slowly to activate a particular function. Thus, for example, because the user speaks more clearly or more slowly, the probability of accurately recognizing the voice instruction increases. That is, in an embodiment of the present invention, the voice recognition function is activated by a physical manipulation of a button on the terminal, rather than by speaking into the terminal to activate the function.
Further, the controller 180 may start or terminate activation of the voice recognition function based on how many times the user touches a specific button or a portion of the touch screen, how long the user touches the specific button or portion of the touch screen, etc. The user can also set how the controller 180 activates the voice recognition function using an appropriate menu option provided by the present invention. For example, the user can select a menu option on the terminal that includes 1) setting activation of voice recognition based on a number X of times the voice activation button is selected, 2) setting activation of voice recognition based on an amount of time X the voice activation button is selected, 3) setting activation of voice recognition when buttons X and Y are selected, etc. The user can then enter the values of X and Y in order to variably set how the controller 180 determines that the voice activation function is activated. Thus, according to an embodiment of the present invention, the user is actively engaged with the voice recognition function of his or her own mobile terminal, which increases the probability that the controller 180 will determine the correct function corresponding to the user's voice instruction, and which also allows the user to tailor the voice activation function according to his or her needs.
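The three configurable activation rules above can be sketched as a small settings object plus a predicate; the field names, modes, and default values are illustrative assumptions, not the patent's terminology.

```python
from dataclasses import dataclass

@dataclass
class VoiceActivationConfig:
    """User-configurable activation rule (names are illustrative)."""
    mode: str                      # "press_count", "press_duration", or "button_combo"
    press_count: int = 3           # X presses required to activate
    press_duration: float = 1.5    # X seconds of holding required to activate
    combo: tuple = ("X", "Y")      # two buttons that must both be held

def should_activate(cfg, presses=0, held_seconds=0.0, buttons_down=()):
    """Evaluate the current input state against the configured activation rule."""
    if cfg.mode == "press_count":
        return presses >= cfg.press_count
    if cfg.mode == "press_duration":
        return held_seconds >= cfg.press_duration
    if cfg.mode == "button_combo":
        return set(cfg.combo) <= set(buttons_down)
    return False
```

A settings menu would simply write the user's chosen mode and X/Y values into such an object, and the input handler would consult `should_activate` on each event.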
The controller 180 may also maintain the activated state of the voice recognition function while a designated button is touched or selected, and terminate the voice recognition function when the designated button is released. Alternatively, the controller 180 may maintain the activation of the voice recognition function for a predetermined time period after the designated button is touched or selected, and stop or terminate the voice recognition function when the predetermined time period ends. In yet another embodiment, the controller 180 can store received voice instructions in the memory 160 while the voice recognition function is maintained in the activated state.

Further, as shown in FIG. 5, a domain of a database used as a reference for judging the meaning of a voice command is specified as information relating to a specific function or menu on the terminal (S102). For instance, the specified domain of the database may be information relating to menus currently displayed on the display 151, or information relating to sub-menus of one of the displayed menus. Further, because the domain of the database is specified, the recognition rate for an input voice command is improved. Examples of domains include an e-mail domain, a received-calls domain, a multimedia domain, etc.
Also, the information relating to sub-menus may be configured as data in a database. For example, the information may be configured in the form of keywords, and a plurality of pieces of information may correspond to one function or menu. In addition, the database can be a plurality of databases according to the characteristics of the information, and can be stored in the memory 160.

Further, the information in the database(s) may advantageously be updated or renewed through a learning process. Each domain of the respective databases may also be specified as a domain relating to a currently output function or menu, so as to improve the recognition rate for a voice command. The domain may also change as menu steps progress.
Once the voice recognition function has been activated (Yes in S101) and the domain has been specified (S102), the controller 180 determines whether the user has input a voice command (S103). When the controller 180 determines that the user has input a voice command (Yes in S103), the controller 180 analyzes the context and content of the voice command or instruction input through the microphone 122 based on the specified database, so as to judge the meaning of the voice command (S104).

Further, the controller 180 can determine the meaning of the voice instruction or command based on a language model and an acoustic model of the accessed domain. In more detail, the language model relates to the words themselves, and the acoustic model corresponds to the way the words are spoken (e.g., the frequency components of the spoken words or phrases). Using the language and acoustic models together with the specified domain and the state of the mobile terminal 100, the controller 180 can effectively determine the meaning of the input voice instruction or command.
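Combining an acoustic model score with a language model score is commonly done as a weighted sum of log-probabilities; the sketch below illustrates that idea only — the weight, candidate phrases, and probabilities are invented for illustration and are not from the patent.

```python
import math

def combined_score(acoustic_prob, language_prob, lm_weight=0.7):
    """Weighted log-probability combination of acoustic and language model
    scores, as commonly used in speech decoders (weight is illustrative)."""
    return math.log(acoustic_prob) + lm_weight * math.log(language_prob)

def pick_best(candidates):
    """candidates: {phrase: (acoustic_prob, language_prob)} -> best phrase."""
    return max(candidates, key=lambda p: combined_score(*candidates[p]))
```

With a domain restricted to the current menu, the language model assigns high probability only to phrases meaningful in that menu, which is one way the specified domain raises the recognition rate.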
In addition, while the controller 180 stores the input voice command in the memory 160, the controller 180 may start the process of judging the meaning of the input voice command as soon as the user releases the activation of the voice recognition function, or may perform the recognition simultaneously while the voice command is being input.

Further, if a voice command has not been fully input (No in S103), the controller 180 can perform other functions. For example, if the user performs another action such as touching a menu option, or presses a button on the terminal (Yes in S109), the controller 180 performs the corresponding selected function (S110).
Further, after determining the meaning of the input voice command in step S104, the controller 180 outputs a result value of the meaning (S105). That is, the result value may include a control signal for executing a menu relating to a function or service corresponding to the determined meaning, for controlling a specific component of the mobile terminal, etc. The result value may also include data for displaying information relating to the recognized voice command.

The controller 180 may also request that the user confirm whether the output result value is correct (S106). For instance, when the voice command has a low recognition rate or is determined to have a plurality of meanings, the controller 180 can output a plurality of menus relating to the respective meanings, and then execute the menu selected by the user (S107). Also, the controller 180 may ask the user whether to execute a specific menu having a high recognition rate, and then execute or display the corresponding function or menu according to the user's selection or response.
In addition, the controller 180 can also output a voice message asking the user to select a particular menu or option, for example 'Do you want to execute the photo album menu? Reply Yes or No.' The controller 180 then executes or does not execute the function corresponding to the particular menu or option based on the user's response. If the user does not respond within a specific time period (e.g., five seconds), the controller 180 can also immediately execute the particular menu or option. That is, when there is no response from the user, the controller 180 may automatically execute the function or menu by judging the non-response as a positive answer.

Further, the user can answer the question from the controller 180 using his or her voice (e.g., Yes or No) or via other input units such as hardware or software buttons, a touch pad, etc. In addition, in step S106, if there is a negative answer from the user (No in S106), that is, if the meaning of the voice command has not been judged accurately, the controller 180 can execute an additional error processing step (S108).
That is, the error processing step may be performed by again receiving the input of a voice command, or by displaying a plurality of menus having a recognition rate above a certain level, or a plurality of menus that may be judged to have similar meanings. The user can then select one of the plurality of menus. Also, when the number of functions or menus having a recognition rate above the certain level is less than a predetermined number (e.g., two), the controller 180 can automatically execute the corresponding function or menu.
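The branching logic just described — retry, auto-execute, or present a choice list — can be sketched as a single dispatch function. The threshold of 0.8 and the structure of the return values are illustrative assumptions.

```python
def resolve_candidates(candidates, level=0.8, auto_execute_below=2):
    """candidates: {menu_name: recognition_rate in [0, 1]}.
    Returns ("execute", menu) when few enough high-rate candidates remain,
    ("choose", [menus]) so the user can pick, or ("retry", None)."""
    above = sorted((m for m, r in candidates.items() if r >= level),
                   key=lambda m: -candidates[m])
    if not above:
        return ("retry", None)          # ask the user to repeat the command
    if len(above) < auto_execute_below:
        return ("execute", above[0])    # a single confident match: run it
    return ("choose", above)            # several matches: let the user select
```

The `choose` branch corresponds to displaying the candidate menus for user selection, and the `execute` branch to the automatic execution when fewer than the predetermined number of candidates remain.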
Next, FIG. 6A is an overview showing a method of activating the voice recognition function of a mobile terminal according to an embodiment of the present invention. As shown in the display screen 410, the user can activate the voice recognition function by touching a soft button 411. The user can also terminate the voice recognition function by releasing the soft button 411. More specifically, the user can activate the voice recognition function by touching the soft button 411, and keep touching the soft button 411 or a hard button 412 until the voice instruction is completed. That is, the user can release the soft button 411 or hard button 412 when the voice instruction is completed. The controller 180 thereby knows when the voice instruction is to be input and when it has been completed. As discussed above, because the user is directly involved in this determination, the accuracy of interpreting the input voice command increases.

In addition, the soft button 411 shown in the display screen 410 can be a single soft button that the user presses or releases to activate/deactivate the voice recognition function, or can be a menu button that, when selected, produces a menu list such as '1. Start voice activation, 2. Stop voice activation'. The soft button 411 can also be displayed during a standby state, for example.

In another example, as shown in the display screen 420, the user can also activate and deactivate the voice recognition function by touching an arbitrary position of the touch screen. The display screen 430 illustrates yet another example, in which the user activates and deactivates the voice recognition function by producing a specific sound or sound effect above a specific level. For example, the user can clap their hands to produce such an impact sound.
Thus, according to an embodiment of the present invention, the voice recognition function may be implemented in two modes. For example, the voice recognition function may be implemented in a first mode for detecting a particular sound or sound effect above a certain level, and in a second mode for recognizing a voice command and determining its meaning. If the sound or sound effect is above the certain level in the first mode, the second mode is activated, thereby recognizing the voice command.
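The two-mode arrangement can be sketched as a small state machine: a cheap level gate runs continuously, and the expensive recognizer runs only after the gate fires. The threshold value and the recognizer callback below are illustrative placeholders.

```python
class TwoModeRecognizer:
    """First mode: inexpensive sound-level gate. Second mode: full recognition.
    The threshold and recognizer callback are illustrative assumptions."""

    def __init__(self, recognize, threshold=0.5):
        self.recognize = recognize   # expensive recognizer, run only in mode 2
        self.threshold = threshold
        self.mode = 1                # start in the low-cost detection mode

    def feed(self, level, audio=None):
        if self.mode == 1:
            if level > self.threshold:
                self.mode = 2        # loud sound detected: switch modes
            return None
        result = self.recognize(audio)  # second mode: recognize the command
        self.mode = 1                # drop back to the cheap mode afterwards
        return result
```

Keeping the recognizer out of the loop until the gate fires is exactly why, as the next paragraph notes, the voice recognition function need not execute continuously.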
Therefore, according to an embodiment of the present invention, because the voice activation function is started and stopped, the voice recognition function is not executed continuously. That is, when the voice recognition function is continuously maintained in the activated state, the amount of resources used on the mobile terminal increases compared to the embodiments of the present invention.

Further, as discussed above with respect to FIG. 5, when the voice recognition function is activated, the controller 180 specifies the domain of the specific database used as a reference for voice command recognition as a domain relating to the menu list on the display 151. Then, if a specific menu is selected or executed from the menu list, the domain of the database may be specified as information relating to the selected menu or to sub-menus of the specific menu.
In addition, when the specific menu is selected or executed via a voice command or touch input, the controller 180 may output help information relating to the sub-menus of the specific menu in the form of a voice message, or in the form of a pop-up window or balloon. For example, as shown in FIG. 6B, when the user selects the 'multimedia menu' via a touch or voice operation, the controller 180 displays the information relating to the sub-menus of the 'multimedia menu' (e.g., broadcasting, camera, text viewer, game, etc.) as balloon-shaped help information 441. Alternatively, the controller 180 can output a voice signal 442 including the help information. The user can then select one of the displayed help options using a voice command or a touching operation.

FIG. 6C illustrates an embodiment in which a user selects a menu item using his or her body motion (in this example, the user's hand gesture). In more detail, as the user moves his or her finger closer to the menu item 443, the controller 180 displays the sub-menus 444 relating to the menu 443. The controller 180 can recognize the user's body movement information, for example, via the sensing unit 140. Further, the displayed help information can be displayed with a transparency or brightness controlled according to the user's distance. That is, as the user's hand gets closer, the displayed items can be further highlighted.

As discussed above, the controller 180 can be configured to determine the start and stop of the voice recognition function based on a variety of methods. For example, the user can select/manipulate a soft or hard button, touch an arbitrary position on the touch screen, etc. The controller 180 can also maintain the activation of the voice recognition function for a predetermined amount of time, and then automatically terminate the activation at the end of the predetermined amount of time. Also, the controller 180 may maintain the activation only while a specific button or touch operation is being performed, and then automatically terminate the activation when the input is released. The controller 180 can also terminate the activation when a voice command has no longer been input for a certain amount of time.
Next, FIG. 7A is a flow chart illustrating a method of recognizing a voice command in a mobile terminal according to an embodiment of the present invention. Referring to FIG. 7A, when the voice recognition function is activated, the controller 180 specifies the domain of the database that can be used as a reference for voice command recognition as a domain relating to the menu displayed on the display 151 or to sub-menus of that menu (S201). The user then inputs a voice command (S202), either using the precise menu name or using natural language (e.g., spoken English).

The controller 180 then stores the input voice command in the memory 160 (S203). Further, when the voice command is input under the specified domain, the controller 180 analyzes the context and content of the voice command based on the specified domain using a voice recognition algorithm. Also, the voice command may be converted into text-type information for the analysis (S204) and then stored in a specific database of the memory 160. However, the step of converting the voice command into text-type information can be omitted.

Then, to analyze the context and content of the voice command, the controller 180 detects a specific word or keyword of the voice command (S205). Based on the detected word or keyword, the controller 180 analyzes the context and content of the voice command, and determines or judges the meaning of the voice command by referring to the information stored in the specific database.
In addition, as discussed above, the database used as a reference includes a specified domain, and the function or menu corresponding to the meaning of the voice command judged based on the database is executed (S207). Also, because the database for voice recognition is specified for each piece of information relating to a specific menu, the recognition rate and the speed of recognizing the voice command are improved, and the amount of resources used on the terminal is reduced. Further, the recognition rate indicates a matching degree with a name preset for a specific menu.

The recognition rate for an input voice command may also be judged based on the number of pieces of information relating to a specific function or menu contained in the voice command. Therefore, the recognition rate for the input voice command is improved when the information precisely matches a specific function or menu (e.g., a menu name) included in the voice command.

In more detail, FIG. 7B is an overview showing a method of recognizing a voice command of a mobile terminal according to an embodiment of the present invention. As shown in FIG. 7B, the user inputs a voice command, 'I want to see my pictures,' as a natural language sentence composed of six words. In this example, the recognition rate can be judged based on the number of meaningful words (e.g., 'see', 'pictures') relating to a specific menu (e.g., the photo album menu). In addition, the controller 180 can determine whether the words included in the voice command are meaningful words relating to a specific function or menu based on the information stored in the database. For instance, meaningless words included in the natural language voice command that are irrelevant to the specific menu may be the subject ('I'), the preposition ('to'), and the possessive pronoun ('my').
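A keyword-overlap score captures this idea: discard the meaningless words, then rate each menu by how many of the remaining words match its stored keywords. The keyword and stopword lists below are illustrative placeholders, not the patent's database contents.

```python
# Illustrative domain data: keywords stored per menu (not from the patent).
MENU_KEYWORDS = {
    "photo album": {"see", "view", "picture", "pictures", "photo", "album"},
    "camera": {"take", "shoot", "camera", "photo"},
}
STOPWORDS = {"i", "want", "to", "my", "the", "a"}  # meaningless words

def recognition_rates(command):
    """Score each menu by the share of meaningful command words it matches."""
    words = [w for w in command.lower().split() if w not in STOPWORDS]
    if not words:
        return {m: 0.0 for m in MENU_KEYWORDS}
    return {m: sum(w in kws for w in words) / len(words)
            for m, kws in MENU_KEYWORDS.items()}
```

For 'I want to see my pictures', only 'see' and 'pictures' survive the stopword filter, and both match the photo album keywords, giving that menu a full score.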
Also, natural language is language commonly spoken by people, and is a concept opposite to that of an artificial language. Further, natural language can be processed using a natural language processing algorithm. Natural language may or may not include the precise name relating to a specific menu, which sometimes causes difficulty in completely and precisely recognizing a voice command. Therefore, according to an embodiment of the present invention, when a voice command has a recognition rate above a certain level (e.g., 80%), the controller 180 judges the recognition to be accurate.

In addition, when the controller 180 judges that a plurality of menus have similar meanings, the controller 180 displays the plurality of menus, and the user can select one of the displayed menus to have its function executed. Also, the menu having a relatively higher recognition rate may be displayed first, or may be displayed distinctively compared with the other menus.
For example, FIG. 8 is an overview showing a method of displaying menus of a mobile terminal according to their voice recognition rates according to an embodiment of the present invention. As shown in FIG. 8, the menu icon having the highest recognition rate is displayed at a central portion of the display screen 510, or is displayed with a larger size or a darker color, as shown in the display screen 520. The menu icon having the highest recognition rate may also be displayed first, followed successively or sequentially by the icons with lower recognition rates.

Further, the controller 180 can distinctively display the plurality of menus by changing at least one of the size, position, color and brightness of the menus, or by highlighting them in the order of their recognition rates. The transparency of the menus may also be appropriately changed or controlled.
In addition, as shown in the lower portion of FIG. 8, a menu having a higher user selection rate may be updated or set to have a higher recognition rate. That is, the controller 180 stores a history of the user's selections (S301) and performs a learning process (S302), so as to update the particular recognition rate of a menu option that is selected by the user more often than the other menu options (S303). Thus, the number of times a frequently used menu has been selected by the user can be applied to that menu's recognition rate. Therefore, a voice command input with the same or a similar pronunciation or content may have a different recognition rate according to the number of times the user has selected the particular menu.
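The S301–S303 learning loop can be sketched as a counter that nudges a menu's rate upward with each selection; the linear boost formula and its cap are illustrative assumptions, not the patent's actual update rule.

```python
class RecognitionLearner:
    """Boosts a menu's recognition rate as the user selects it more often.
    The boost formula and cap are illustrative, not taken from the patent."""

    def __init__(self, base_rates, boost=0.02, cap=1.0):
        self.base_rates = dict(base_rates)
        self.selections = {m: 0 for m in base_rates}
        self.boost, self.cap = boost, cap

    def record_selection(self, menu):
        self.selections[menu] += 1   # S301: store the selection history

    def rate(self, menu):
        # S302/S303: the learned rate grows with the menu's selection count
        boosted = self.base_rates[menu] + self.boost * self.selections[menu]
        return min(boosted, self.cap)
```

Two menus that start with identical base rates thus diverge over time, so the same spoken phrase can resolve differently for a heavy user of one menu.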
In addition, the controller 180 can also store the times at which the user performs particular functions. For example, a user may check e-mails or missed messages every time they wake up on Mondays through Fridays. This time information can also be used to improve the recognition rate. The state of the terminal (e.g., a standby mode, etc.) can also be used to improve the recognition rate. For example, the user may check e-mails or missed messages when first turning on their mobile terminal, when the terminal is opened from a closed position, etc.

Next, FIG. 9 is an overview showing a method of recognizing a voice command of a mobile terminal according to another embodiment of the present invention. As shown in FIG. 9, the user activates the voice recognition function and inputs the voice command 'I want to see my pictures.' The controller 180 then specifies the domain of the database for voice command recognition as a domain relating to the displayed sub-menus. In this example, the controller 180 then interprets the voice command (S401) and displays a plurality of menus having a probability above a particular value (e.g., 80%) (S402). As shown in the display screen 610 in FIG. 9, the controller displays four multimedia menus.

Also, as shown in step S402 in the lower portion of FIG. 9, the controller 180 can immediately execute a function when only a single menu is determined to be above the predetermined probability. That is, when the photo album menu option 621 is determined to be the only menu having a recognition rate or probability above the predetermined threshold, the controller 180 immediately displays the pictures in the photo album, as shown in the display screen 620, without the user having to select the photo album menu option 621. In addition, even though a menu has a precise name such as 'photo album', the memory 160 can store a plurality of pieces of information relating to that menu, such as 'photo, picture, album'.
In addition, as discussed above with respect to FIG. 6B, when a specific menu is selected or executed via a voice command or touch input according to an operating state or mode (e.g., a mode indicating the voice recognition function), the controller 180 can also output help information to the user. Further, the user can set the operating mode for outputting help using an appropriate menu option provided in an environment setting menu. Accordingly, a user can operate the terminal of the present invention without needing or having a high level of technical skill. That is, many elderly users may not be experienced in operating the plurality of different menus provided with a terminal. However, with the terminal of the present invention, a user who is generally unfamiliar with the intricacies of the user interfaces provided with terminals can easily operate the mobile terminal.

Further, when the controller 180 recognizes a voice command as having a plurality of meanings (i.e., when a natural language voice command does not include a precise menu name, such as when a menu is included in the 'multimedia' category but does not include the precise name of one of 'camera', 'photo album' and 'video'), the controller 180 displays a plurality of menus having a recognition rate above a certain value (e.g., 80%).
Next, FIG. 10 is an overview of a plurality of databases used by the controller 180 for recognizing a voice command of a mobile terminal according to an embodiment of the present invention. In this embodiment, the databases store information that the controller 180 uses to judge the meaning of a voice command, and any number of databases may be used according to the characteristics of the information. Further, the respective databases, configured according to the characteristics of the information, can be updated through a continuous learning process under the control of the controller 180.

For example, the learning process attempts to match the user's voice with a corresponding word. For example, when the Korean word 'Saeng-il' (meaning 'birthday') spoken by the user is misrecognized as 'Saeng-hwal' (meaning 'life'), the user corrects the word 'Saeng-hwal' into 'Saeng-il'. Accordingly, the same pronunciation subsequently input by the user will be recognized as 'Saeng-il'.

As shown in FIG. 10, the respective databases according to the characteristics of the information include a first database 161, a second database 162, a third database 163 and a fourth database 164. In this embodiment, the first database 161 stores voice information for recognizing a voice input through the microphone in units of phonemes, syllables, or morphemes. The second database 162 stores information (e.g., grammar, pronunciation accuracy, sentence structure, etc.) for judging the entire meaning of a voice command based on the recognized voice information. The third database 163 stores information relating to menus for functions or services of the mobile terminal, and the fourth database 164 stores messages or voice information to be output from the mobile terminal so as to receive the user's confirmation about the judged meaning of the voice command.
In addition, the third database 163 can be specified as information relating to menus of a particular category according to the domain preset for voice command recognition. Also, the respective databases can store sound (pronunciation) information, and the phonemes, syllables, morphemes, words, keywords, or sentences corresponding to the pronunciation information. Accordingly, the controller 180 can determine or judge the meaning of a voice command by using at least one of the plurality of databases 161 to 164, and execute a menu relating to a function or service corresponding to the judged meaning of the voice command.
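The division of labor between the four databases can be sketched as a pipeline from pronunciation units to a confirmation prompt. All dictionary contents below are toy placeholders standing in for the databases of FIG. 10.

```python
# Toy layout of the four databases of FIG. 10; contents are placeholders.
databases = {
    "db1_voice_units":  {"pik-cher": "picture"},              # sound -> word unit
    "db2_language":     {("see", "picture"): "view images"},  # words -> meaning
    "db3_menus":        {"view images": "photo album"},       # meaning -> menu
    "db4_confirmation": {"photo album": "Do you want to execute photo album?"},
}

def judge_and_confirm(pronunciations):
    """Run a pronunciation sequence through db1..db4 to a confirmation prompt."""
    words = tuple(databases["db1_voice_units"].get(p, p) for p in pronunciations)
    meaning = databases["db2_language"].get(words)
    menu = databases["db3_menus"].get(meaning)
    return menu, databases["db4_confirmation"].get(menu)
```

Restricting `db3_menus` to one category is the code-level analogue of specifying the third database's domain for the current menu.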
Next, FIG. 11 is an overview showing that the voice recognition function of a mobile terminal according to an embodiment of the present invention is being performed. As shown, while the controller 180 executes the voice recognition function, the controller 180 displays a specific indicator or icon 500 that informs the user the voice recognition function is being executed. The controller 180 can also output a sound or message to inform the user that the voice recognition function is being executed.

Further, the above-described embodiments relate to recognizing a voice instruction of a user. However, the present invention is also applicable to the user performing an additional input function while the voice instruction is being recognized. For example, voice recognition and a touch input, voice recognition and a button input, or voice recognition and a touch/button input may be performed simultaneously.

In addition, the controller 180 may prevent the voice recognition function from being performed in a particular mode or menu, or in a particular operating state. Further, audio information (e.g., a voice announcement or guide information) or video information (e.g., the indicator 500 in FIG. 11) indicating that the voice recognition function is being applied can be displayed in the voice recognition mode, menu, or operating state. Also, information indicating that the voice recognition function can be used may be provided to the user by outputting help information.
FIG. 12 is an overview showing a method of processing sub-commands relating to a specific menu of a mobile terminal via a voice command according to an embodiment of the present invention. In this embodiment, it is assumed that the user has activated the voice recognition function.

Then, as shown on the left side of FIG. 12, the user touches an alarm/schedule icon, and the controller 180 displays a pop-up help menu listing the available functions (e.g., '1) alarm, 2) schedule, 3) to do, 4) memo'). The user then inputs the voice command 'to do', and the controller 180 interprets the meaning of the voice command and displays a plurality of items determined to correspond to the voice command, as shown in the display screen 611.

That is, as shown in the display screen 611, the controller 180 displays four events relating to the to-do function. The user then inputs the voice command 'Select number 2', and the controller 180 selects the second option (Meeting 1). The user then inputs the voice command 'I want to delete it.' The controller 180 then displays a pop-up menu 613 asking the user to confirm whether or not to delete the entry. The user then inputs the voice command 'Yes', and the controller 180 deletes the entry, as shown in the display screen 616 in FIG. 12.

In addition, when there is no response from the user, the controller 180 can automatically execute the sub-command by judging the non-response as a positive answer. The controller 180 also outputs a voice message 615 informing the user that the item has been deleted. Also, rather than touching the first alarm/schedule menu, the user could instead issue another voice command. Also, when the user first selects the alarm/schedule icon, the controller 180 can output a voice message 617 informing the user of the corresponding task to be performed.
Further, as discussed above, when a specific menu is executed, the controller 180 specifies the domain of the database used as a reference for voice command recognition as a domain relating to the executed menu. That is, the domain includes information relating to sub-menus of the specific menu, or information relating to sub-commands that can be executed from the specific menu.

Next, FIG. 13 is an overview showing a method of searching a subway map in a mobile terminal via a voice command according to an embodiment of the present invention. In this example, it is again assumed that the user has activated the voice recognition function. It is also assumed that the controller 180 executes a specific menu relating to displaying a subway map based on the user's voice command or a manipulation using other input units.

That is, the controller 180 displays the subway map as shown in the display screen 621. As discussed above, when the specific menu is executed, the controller 180 can specify the domain of the database used as a reference for voice command recognition as a domain relating to the executed menu (e.g., the names of the subway stations and the distance (time) information between the stations). Further, the domain includes information relating to sub-menus of the specific menu, or a domain relating to sub-commands that can be executed from the specific menu.
The controller 180 then outputs a voice message 626 requesting the user to input a start station or a destination station. The user then selects two stations on the display screen 621. That is, the controller 180 receives two stations 622 and 623 on the displayed subway map for which the user wants to know the amount of time required to travel between them. The user can select the two stations by a voice command when prompted by the terminal (that is, by speaking the start and destination stations), or by touching the two stations 622 and 623. Other methods of selecting the two stations are also possible. After the user selects the two stations, the controller 180 outputs a voice message 624 including the two selected stations (that is, "ISU and Seoul station are selected") via the speaker. Alternatively, rather than outputting a voice message, the controller 180 can display a pop-up window with the requested or input information.
Further, when the two stations are selected, the controller 180 can also output help information. For example, as shown in the display screen 621 in Fig. 13, the controller displays a balloon-shaped pop-up help window listing the station name and the subway line color. The user then requests the amount of time required to travel between the two selected stations. The user can request this information by inputting a voice instruction such as "I want to know how long it will take from ISU to Seoul station".
Further, when the controller 180 determines the meaning of the voice instruction, the controller 180 can first ask the user to confirm whether the determined meaning of the voice command is accurate. The controller 180 then displays the two stations on the subway map together with the distance (or time) between the two stations, the number of stations between them, and so on, and outputs a voice message 627 informing the user of the result, as shown in the display screen 625 in Fig. 13. Further, as mentioned above, if the user does not respond to the confirmation request within a specific time period, the controller 180 can interpret the lack of response as an affirmative acknowledgement and provide the result of the requested service.
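The "how long from A to B" answer can be computed from the per-segment travel times stored in the subway domain, for example by a shortest-path search. The following is a hedged sketch under assumed data: the station graph and travel minutes are made up for illustration and are not from the patent.

```python
# Hypothetical sketch: the domain stores travel minutes between adjacent
# stations; the route time is summed along the shortest path (Dijkstra).
import heapq

EDGES = {  # adjacent stations and travel minutes (invented example data)
    "ISU": {"Dongjak": 3},
    "Dongjak": {"ISU": 3, "Seoul": 9},
    "Seoul": {"Dongjak": 9},
}

def travel_minutes(start, end):
    """Return the minimum travel time between two stations, or None."""
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        d, station = heapq.heappop(queue)
        if station == end:
            return d
        if d > dist.get(station, float("inf")):
            continue
        for nxt, w in EDGES.get(station, {}).items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return None
```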
Next, Fig. 14 illustrates an overview of a method of reproducing a multimedia file by a voice command in the mobile terminal according to an embodiment of the present invention. The following description also assumes that the user has input an activation control signal and the controller 180 has started to activate the voice recognition function. It is also assumed that the controller 180 has executed the specific menu related to a multimedia reproduction menu by receiving an input of a voice command or by a user manipulation of another input unit.
That is, as shown in the display screen 631, the controller 180 displays a list of songs the user can select to play. Thus, in the present invention, a multimedia file desired by the user can be directly searched for by a voice command and then reproduced. More specifically, once the multimedia reproduction menu is executed, the controller 180 designates the domain of the database used as a reference for voice command recognition as a domain related to the executed menu.
As mentioned above, the domain includes information related to submenus of the multimedia reproduction menu, information related to subcommands that can be executed from the multimedia reproduction menu, or information related to the multimedia files (for example, file names, playback times, copyright owners, etc.).
Further, the controller 180 can display the multimedia file list by receiving an input of a voice command or by a user manipulation of another input unit. In the example of Fig. 14, with one file selected from the file list as shown in the display screen 631, the user inputs a voice command in his or her natural language (for example, "Let's play this song").
Once the voice command is input, the controller 180 detects meaningful words related to submenus or subcommands (for example, "play", "this song") in the domain used for processing the selected menu. Further, the controller 180 determines the meaning of the voice command by analyzing the detected words and the overall context and content of the voice command.
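The keyword-detection step can be sketched as matching the words of the natural-language command against the active domain's vocabulary and mapping the hits to an action. This is a minimal illustration under assumed names; a real terminal would also weigh context, as the description says.

```python
# Hypothetical sketch: detect meaningful words inside the active domain
# and map them to a command meaning. Keyword table is invented.

DOMAIN_KEYWORDS = {
    "play": "PLAY_FILE",
    "song": "PLAY_FILE",
    "pause": "PAUSE",
    "stop": "STOP",
}

def judge_meaning(utterance):
    """Return the action whose domain keywords occur most often in the
    command, or None when no meaningful word is detected."""
    words = utterance.lower().replace(",", " ").split()
    hits = [DOMAIN_KEYWORDS[w] for w in words if w in DOMAIN_KEYWORDS]
    if not hits:
        return None
    return max(set(hits), key=hits.count)
```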
Once the meaning of the voice command is determined, the controller 180 receives confirmation from the user about whether the determined meaning of the voice command is accurate. For example, as shown in Fig. 14, the controller 180 displays a pop-up window 633 asking the user to say "Yes" or "No" about playing the selected song. The controller can also output a voice message 632 asking the user whether song 2 is the song to be played. The user can then say "Yes", and the controller 180 plays the displayed song, as shown in the display screen 634.
Alternatively, the controller 180 can play the selected song automatically without asking the user to confirm the selection. The user can also use an appropriate menu option to set, as a default, whether the controller 180 requests confirmation of a selected task or not. Further, if there is no response from the user, the controller 180 can automatically execute the determined voice command by judging the lack of response as an affirmative acknowledgement.
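The confirmation rule that recurs throughout these embodiments — silence within the interval counts as "yes" — can be sketched as follows. The reply provider is injected so the flow can be shown without a real microphone; all names are invented for illustration.

```python
# Hypothetical sketch of the confirmation flow: if the user does not
# answer within a specific interval, the lack of response is treated
# as an affirmative acknowledgement and the command is executed.

def confirm_or_default(get_reply, timeout_s=5.0):
    """Return True (execute) unless the user explicitly answers 'no'.

    `get_reply(timeout_s)` is assumed to return the user's answer as a
    string, or None when the interval elapses with no response.
    """
    reply = get_reply(timeout_s)
    if reply is None:              # silence -> affirmative acknowledgement
        return True
    return reply.strip().lower() == "yes"
```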
Thus, in this embodiment, the file to be reproduced is selected, and a reproduction command for the selected file is input by a voice command. However, when the user knows the file name, the file name can be directly input from an upper-level menu by a voice command.
Next, Fig. 15 illustrates an overview of a method of sending an e-mail or text message by a voice command in the mobile terminal according to an embodiment of the present invention. This embodiment is again described assuming that an activation control signal has been input, the controller 180 has started to activate the voice recognition function, and the controller 180 has executed a specific menu (for example, a mail/message send/receive menu) by receiving an input of a voice command or by a user manipulation of another input unit.
More specifically, once the mail (or message) send/receive menu is executed, the controller 180 designates the domain of the database used as a reference for voice command recognition as a domain related to the executed menu. The domain includes information related to submenus of the mail/message send/receive menu, information related to subcommands that can be executed from the mail/message send/receive menu, and information related to sent/received mails/messages (for example, sender, receiver, transmission/reception time, title, etc.).
The controller 180 also displays a list of sent/received mails/messages by receiving an input of a voice command or by a user manipulation of another input unit. As shown in the display screen 641, the user inputs the voice instruction "I want to reply". The controller 180 then displays the received messages the user can reply to, as shown in the display screen 645. In this example, with one mail/message selected from the mail/message list as shown in the display screen 645, the user uses his or her natural language (for example, "Reply to this message").
Further, once the voice command is input, the controller 180 detects meaningful words (for example, "reply", "this message") in the domain related to the reply processing of the selected mail/message. Then, the controller 180 determines the meaning of the voice command (execute the mail/message reply menu) by analyzing the detected words and the overall context and content of the voice command.
Once the meaning of the voice command is determined, the controller 180 can receive confirmation from the user about whether the determined meaning of the voice command is accurate. For example, a voice message 642 or a text-type message 643 can be output for user confirmation. When the message for user confirmation is output, the user can reply by voice or by another input unit. If there is no response from the user, the controller 180 can automatically execute the function corresponding to the determined meaning by judging the lack of response as an affirmative acknowledgement. Then, when the mail/message reply menu is executed, the controller 180 automatically inputs the address/phone number of the selected counterpart in the mail/message writing window 644.
Thus, in this embodiment, the mail/message to be replied to is first selected, and a reply command for the selected mail/message is input by a voice command. However, when the user knows information about the counterpart, a reply to the counterpart's mail/message can be directly input by a voice command.
Further, the embodiment shown in Fig. 15 can be modified to correspond to sending a text message. More specifically, the controller 180 includes software that converts the user's speech into text, so that the user can tell the terminal what he or she is thinking, and the controller 180 converts the input speech into a text message. The controller 180 can also display the converted text to the user, so the user can confirm that the conversion is acceptable. The user can then request the terminal to send the text message to a desired party.
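The modified speech-to-text messaging flow can be sketched as: convert, show for confirmation, then send. The functions `speech_to_text`, `show`, and `send_sms` stand in for terminal services and are purely hypothetical; they are passed in as parameters so the sketch stays self-contained.

```python
# Hedged sketch of the modified embodiment (assumed service functions):
# speech is converted to a draft text, shown to the user for approval,
# and sent to the desired party only if approved.

def compose_and_send(speech_to_text, show, send_sms, audio, recipient):
    draft = speech_to_text(audio)   # convert the input speech to text
    approved = show(draft)          # display the draft; user confirms
    if approved:
        send_sms(recipient, draft)
        return draft
    return None
```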
The modified embodiment is particularly advantageous because inputting a text message by hand is a laborious and tedious process. For many different reasons, many users want to send a text message rather than call a party, but do not want to go through the laborious process of manually pressing multiple keys to send a single text message. The modified embodiment of the present invention enables the user to input a desired text message by voice and then send the text message to the desired party.
Fig. 16 illustrates an overview of a method of making a phone call by a voice command in the mobile terminal according to an embodiment of the present invention. Similar to the above embodiments, this embodiment also assumes that the user has input an activation control signal, the controller 180 has activated the voice recognition function, and the controller 180 has executed a specific menu related to a phone call (for example, a phonebook or a menu listing recently received calls) by receiving an input of a voice command or by a user operation of another input unit.
Once the menu related to a phone call is executed, the controller 180 designates the domain of the database used as a reference for voice command recognition as a domain related to the phone call. Further, the domain includes information related to originated calls, received calls, missed calls, and so on, and information related to each call (for example, origination time, reception time, caller, receiver, call duration, call frequency, etc.).
Further, the controller 180 displays a phone call list by receiving an input of a voice command or by a user manipulation of another input unit. That is, the user inputs a voice command in his or her natural language (for example, "I want to see the received calls"), as shown in the display screen 711.
Once the voice command is input, the controller 180 detects the meaningful words related to the phone call in the domain (for example, "see", "received", "calls"), and determines that the voice command has the meaning of "output the received calls" by analyzing the detected words and the overall context and content of the voice command. Once the meaning of the voice command is determined, the controller 180 outputs a list of the received calls, as shown in the display screen 712.
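Outputting the received calls amounts to filtering the call-history domain by direction once the meaning has been judged. The sketch below illustrates this under invented data; the record layout and field names are assumptions, not the patent's data model.

```python
# Hypothetical sketch: the call-history domain records a direction per
# entry, and the judged meaning "output the received calls" selects a
# filter over it. Example data is invented.

CALL_LOG = [
    {"party": "James", "direction": "received"},
    {"party": "Anna", "direction": "originated"},
    {"party": "Mike", "direction": "missed"},
    {"party": "James", "direction": "received"},
]

def calls_by_direction(direction):
    """Return the counterparts of all calls with the given direction."""
    return [c["party"] for c in CALL_LOG if c["direction"] == direction]
```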
Further, the user then inputs the voice command "Call this person" with one entry selected from the output list. As a result, the controller 180 determines that the voice command has the meaning of "call the counterpart of the selected received call". Then, the controller 180 receives confirmation from the user about whether the determined meaning of the voice command is accurate. That is, the controller 180 can output a voice message 713 or a text-type message 715.
The user can also reply by voice or by another input unit. As mentioned above, if there is no response from the user, the controller 180 can automatically execute the function corresponding to the determined meaning by judging the lack of response as an affirmative acknowledgement. The controller 180 also outputs a message 714 indicating that the call connection is in progress.
Thus, in this embodiment, a counterpart is selected from the phone call list, and a call command to the selected counterpart is input by a voice command. However, when the user already knows information about the counterpart, a call to that person can be made directly by a voice command.
Next, Fig. 17 illustrates an overview of a method of using phonebook information by a voice command in the mobile terminal according to an embodiment of the present invention. The description here makes the same assumptions as in the other embodiments above. Namely, it is assumed that once activation control information is input, the controller 180 starts to activate the voice recognition function, and the controller 180 selects or executes a specific menu (for example, a phonebook menu) by receiving an input of a voice command or by a user manipulation of another input unit, as shown in the display screen 720.
Once the phonebook menu is executed, the controller 180 designates the domain of the database used as a reference for voice command recognition as a domain related to submenus of the phonebook menu or subcommands that can be executed from the phonebook menu. Further, the domain is designated in order to improve the recognition rate, but need not necessarily be designated.
Further, in a standby state or a state in which the menu related to the phonebook is selected, the user inputs a voice command in his or her natural language (for example, "Edit James", "Add James", "Find James", "Call James", or "I want to send a message to James"). Once the voice command is input, the controller 180 detects the meaningful words related to the phone call in the domain, and determines the respective meaning of each voice command by analyzing the detected words and the overall context and content of the voice command.
Once the respective meaning of the voice command is determined, the controller 180 executes the function or menu corresponding to the respective voice command, as shown in the display screens 722 to 724. Further, before executing, the controller 180 can receive confirmation from the user about whether the determined meaning of the voice command is accurate. As mentioned above, a voice message or a text-type message can be output for user confirmation.
Further, when the message for user confirmation is output, the user can reply by voice or by another input unit. If there is no response from the user, the controller 180 can automatically execute the function corresponding to the determined meaning by judging the lack of response as an affirmative acknowledgement.
Next, Fig. 18 illustrates an overview of a method of changing a background screen by a voice command in the mobile terminal according to an embodiment of the present invention. This description again assumes that once activation control information is input, the controller 180 starts to activate the voice recognition function, and executes a specific menu (for example, a photo album menu) by receiving an input of a voice command or by a user manipulation of another input unit.
The photo album menu can be executed through multiple steps of submenus by a voice command input or by using another input unit. Alternatively, the photo album menu can be directly executed by a natural language voice command (for example, "I want to see my photo album"), as shown in the display screen 731. According to the determined meaning of the voice command, the controller 180 executes the photo album menu and outputs a list of photos, as shown in the display screen 732. Then, the controller 180 receives one photo selected from the output album list.
In this state, if the user inputs a voice command (for example, "Change my wallpaper with this picture"), the controller 180 detects meaningful information (for example, "change", "wallpaper") related to submenus or subcommands of the executed menu. Then, the controller 180 determines the meaning of the voice command by analyzing the detected words and the overall context and content of the voice command. That is, the controller 180 determines that the voice command has the meaning of "change the background screen to the selected photo".
Once the meaning of the voice command is determined, the controller 180 displays the background screen corresponding to the selected photo, and receives confirmation from the user about whether the determined meaning of the voice command is accurate. Here, a voice message 733 or a text-type message 734 can be output for user confirmation. The determined voice command can also be executed directly without user confirmation, according to a high recognition rate or a preset environment setting menu.
When the message for user confirmation is output, the user can reply by voice or by another input unit. If there is no response from the user, the controller 180 can automatically execute the function corresponding to the determined voice command by judging the lack of response as an affirmative acknowledgement.
In order to change the background screen, the photo album menu can be executed first, as shown in this embodiment of the present invention. Conversely, after the background screen menu is executed, a photo desired by the user can be searched for to be used for the change.
Fig. 19 illustrates an overview of a method of reproducing a multimedia file by a voice command in the mobile terminal according to an embodiment of the present invention. Similar to the above embodiments, this description assumes that once an activation control signal is input, the controller 180 starts to activate the voice recognition function, and executes a specific menu (for example, a multimedia reproduction menu) by receiving an input of a voice command or by a user manipulation of another input unit.
To reproduce a multimedia file, the user ordinarily executes the specific menu, selects one of the submenus of the specific menu to display a file list, and selects a file from the file list to reproduce it. However, in the present invention, a multimedia file desired by the user can be directly searched for by a voice command and then reproduced.
For example, if a specific voice command (for example, "Move to the Beatles album") is input after the voice recognition function is activated, the controller 180 determines the meaning of the voice command by analyzing the overall context and content of the voice command, as shown in the display screen 741. Based on the analyzed information, the controller 180 executes a specific function or menu, or displays a file list by moving to a specific folder, as shown in the display screen 742.
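Directly locating a folder from a command such as "Move to the Beatles album" can be sketched as matching known folder names against the words of the command. The folder tree and file names below are made-up example data, not part of the patent.

```python
# Hypothetical sketch: match folder names of the multimedia domain
# against the natural-language command and move to the matching folder.

FOLDERS = {
    "Beatles": ["Let It Be.mp3", "Hey Jude.mp3"],
    "Queen": ["Bohemian Rhapsody.mp3"],
}

def find_folder(command):
    """Return (folder, files) whose name appears in the command, else None."""
    lowered = command.lower()
    for name, files in FOLDERS.items():
        if name.lower() in lowered:
            return name, files
    return None
```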
When a voice command (for example, "Play this song" or "Play the third one") is input after one file is selected from the file list, the controller 180 determines the meaning of the voice command by analyzing the overall context and content of the voice command. Further, the function or menu corresponding to the meaning of the voice command can be executed directly, according to a high recognition rate or a preset environment setting menu.
Once the meaning of the voice command is determined, the controller 180 receives confirmation from the user about whether the determined meaning of the voice command is accurate. Here, a text-type message or a voice message 743 can be output for user confirmation. When the message for user confirmation is output, the user can reply by voice or by another input unit. If there is no response from the user, the controller 180 can automatically execute the function of the determined voice command by judging the lack of response as an affirmative acknowledgement. The controller 180 then plays the selected song, as shown in the display screen 744.
Thus, in this embodiment, the file to be reproduced is selected, and a reproduction command for the selected file is input by a voice command. However, when the user knows the file name, the file name can be directly input by voice from an upper-level menu for reproduction.
Thus, according to the embodiments of the present invention, in a state in which the voice recognition function is activated, an input voice command is converted into a specific form, and its context and content are compared with a database designated as a reference domain. Further, a result value corresponding to the determined meaning of the voice command is output to a specific component of the mobile terminal.
The mobile terminal of the present invention can control a menu related to a specific function or service by determining the meaning of an input voice command based on its context and content. Further, the mobile terminal of the present invention can improve the voice recognition rate by designating the domain used for voice recognition as a domain related to a specific menu or service according to its operating state or operating mode.
Also, the mobile terminal of the present invention can simultaneously select or execute a menu related to a specific function or service by using one or more of its user interfaces (UIs), even while the voice recognition function is activated, so that a user's manipulation is detected. Further, the mobile terminal of the present invention can control a menu related to a specific function or service via a voice command by providing help information about the input of voice commands according to its operating state or operating mode, regardless of the user's skill level.
Further, the plurality of domains can include at least two of the following domains: an e-mail domain corresponding to e-mails sent and received on the mobile terminal, a scheduled task domain corresponding to schedule events assigned on the mobile terminal, a contact domain corresponding to contacts on the mobile terminal, a phonebook domain corresponding to phone numbers stored on the mobile terminal, a map domain corresponding to map information provided by the mobile terminal, a photo domain corresponding to photos stored on the mobile terminal, a message domain corresponding to messages sent and received on the mobile terminal, a multimedia domain corresponding to multimedia functions performed on the mobile terminal, an external device domain corresponding to external devices connectable to the mobile terminal, a call history domain corresponding to calls sent and received on the mobile terminal, and a settings domain corresponding to setting functions performed on the mobile terminal.
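The plurality of domains listed above can be sketched as an enumeration, so that a terminal implementation could tag each database entry with its domain. The member names are illustrative, not mandated by the patent.

```python
# Hypothetical sketch: the eleven domains named in the description,
# expressed as an enumeration for tagging database entries.
from enum import Enum, auto

class Domain(Enum):
    EMAIL = auto()           # e-mails sent/received on the terminal
    SCHEDULED_TASK = auto()  # schedule events assigned on the terminal
    CONTACT = auto()         # contacts on the terminal
    PHONEBOOK = auto()       # stored phone numbers
    MAP = auto()             # map information provided by the terminal
    PHOTO = auto()           # stored photos
    MESSAGE = auto()         # messages sent/received
    MULTIMEDIA = auto()      # multimedia functions
    EXTERNAL_DEVICE = auto() # connectable external devices
    CALL_HISTORY = auto()    # calls sent/received
    SETTINGS = auto()        # setting functions
```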
Further, the predetermined threshold of the recognition rate can be set by the manufacturer of the mobile terminal or by the user of the mobile terminal.
Further, the above embodiments can be implemented in a computer-readable medium using, for example, computer software, hardware, or some combination thereof. For a hardware implementation, the above embodiments can be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, other electronic units designed to perform the functions described herein, or a selective combination thereof.
For a software implementation, the embodiments described herein can be implemented by separate software modules, such as procedures and functions, each of which performs one or more of the functions and operations described herein. The software code can be implemented by a software application written in any suitable programming language, can be stored in a memory (for example, the memory 160), and can be executed by a controller or processor (for example, the controller 180).
Further, the mobile terminal 100 can be implemented in a variety of different configurations. Examples of such configurations include folder-type, slide-type, bar-type, rotational-type, swing-type, and combinations thereof.
Those skilled in the art will appreciate that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover all such modifications and variations, provided they come within the scope of the appended claims and their equivalents.
Claims (18)
1. A mobile terminal, comprising:
an input unit configured to receive an input to activate a voice recognition function on the mobile terminal;
a display unit configured to display a menu or submenus of the menu;
a memory configured to store a plurality of domains of a database related to menus and operations of the mobile terminal; and
a controller configured to:
designate, when the voice recognition function is activated, a domain of the database usable as a reference for voice command recognition as a domain associated with the menu or submenu displayed on the display unit;
detect at least one keyword of a voice command received via the input unit;
analyze a content and a context of the voice command based on the detected at least one keyword;
determine a meaning of the voice command by referring to information stored in the database; and
execute a function corresponding to the voice command and associated with the menu or submenu.
2. The mobile terminal of claim 1, wherein the menu comprises at least one of a multimedia menu or operation, a contacts menu or operation, a messaging menu or operation, a voice menu or operation, an organizer menu or operation, a screen menu or operation, a utilities menu or operation, a camera menu or operation, and a settings menu or operation.
3. The mobile terminal of claim 1, wherein the controller is further configured to determine a recognition rate indicating how accurately the determined menu and operation correspond to the voice command.
4. The mobile terminal of claim 3, wherein the controller is further configured to adjust the recognition rate of the function based on a number of times the function was previously selected in a correct manner.
5. The mobile terminal of claim 1, wherein the input unit comprises at least one of: 1) a touch soft key touched to activate the voice recognition function, 2) a hard button pressed or manipulated to activate the voice recognition function, 3) an arbitrary position, touched to activate the voice recognition function, of a touch screen included in the input unit, 4) an impact sound input to activate the voice recognition function, 5) a local area radio signal or a remote area radio signal, and 6) a body information signal from a user.
6. The mobile terminal of claim 1, further comprising:
a first database configured to store voice or pronunciation information used by the controller to recognize the voice command;
a second database configured to store word, keyword, or sentence information used by the controller to recognize the voice command;
a third database configured to store information related to functions or menus of the mobile terminal; and
a fourth database configured to store help information to be output to inform the user that the controller is attempting to determine the meaning of the voice command.
7. The mobile terminal of claim 1, wherein the controller is further configured to output audio or video information indicating that the voice recognition function is in an activated state.
8. The mobile terminal of claim 1, wherein the plurality of domains comprise at least two of the following domains: an e-mail domain corresponding to e-mails sent and received on the mobile terminal, a scheduled task domain corresponding to schedule events assigned on the mobile terminal, a contact domain corresponding to contacts on the mobile terminal, a phonebook domain corresponding to phone numbers stored on the mobile terminal, a map domain corresponding to map information provided by the mobile terminal, a photo domain corresponding to photos stored on the mobile terminal, a message domain corresponding to messages sent and received on the mobile terminal, a multimedia domain corresponding to multimedia functions performed on the mobile terminal, an external device domain corresponding to external devices connectable to the mobile terminal, a call history domain corresponding to calls sent and received on the mobile terminal, and a settings domain corresponding to setting functions performed on the mobile terminal.
9. a method of controlling mobile terminal, is characterized in that, described method comprises:
Receive the input that is used for activating the speech identifying function on described mobile terminal;
Receive when activating the input of described speech identifying function, the territory that can be used as the database of voice command identification benchmark be appointed as with described display unit on shown menu or the submenu territory that is associated;
Detect at least one keyword of received voice command;
Based on described at least one keyword that arrives after testing, analyze content and the background of described voice command;
Determine the implication of described voice command by the reference information in described database of being stored in; And
The function that execution is associated corresponding to menu described voice command and described or submenu.
10. The method of claim 9, wherein the specific menu or operation comprises at least one of a multimedia menu or operation, a contacts menu or operation, a messaging menu or operation, a voice menu or operation, an organizer menu or operation, a screen menu or operation, a utilities menu or operation, a camera menu or operation, and a settings menu or operation.
11. The method of claim 9, wherein determining the voice command further comprises:
outputting, on a display unit of the mobile terminal, all menus or submenus that belong to the specific domain and are determined to have a recognition rate higher than a predetermined threshold.
12. The method of claim 11, further comprising:
receiving an input voice command for selecting one of said all menus or submenus;
recognizing the input voice command; and
outputting a query message asking whether the recognized input voice command is accurate.
13. The method of claim 11, further comprising:
outputting, on the display unit, all menus or submenus that belong to the accessed specific domain, match the voice command, and have a recognition rate higher than the predetermined threshold, in order from higher to lower recognition rate.
14. The method of claim 11, wherein the predetermined threshold is set by a manufacturer of the mobile terminal or by a user of the mobile terminal.
15. The method of claim 11, further comprising:
distinguishably displaying, on the display unit, the particular menu or submenu having the highest recognition rate among said all menus or submenus by controlling at least one of a size, a position, a color, a brightness, and a highlight of the menu or operation.
16. The method of claim 9, further comprising:
adjusting the recognition rate of the function based on a number of times the function was previously selected correctly.
17. The method of claim 9, further comprising:
outputting audio or video information indicating that the speech recognition function is in an activated state.
18. The method of claim 9, wherein the plurality of domains comprise at least two of the following domains: an e-mail domain corresponding to e-mails sent and received on the mobile terminal, a schedule task domain corresponding to schedule events assigned on the mobile terminal, a contacts domain corresponding to contacts on the mobile terminal, a phonebook domain corresponding to phone numbers stored on the mobile terminal, a map domain corresponding to map information provided by the mobile terminal, a photo domain corresponding to photos stored on the mobile terminal, a message domain corresponding to messages sent and received on the mobile terminal, a multimedia domain corresponding to multimedia functions performed on the mobile terminal, an external device domain corresponding to external devices connectable to the mobile terminal, a call history domain corresponding to calls sent and received on the mobile terminal, and a settings domain corresponding to setting functions performed on the mobile terminal.
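Claims 11 through 16 describe filtering candidate menus by a recognition-rate threshold, ordering them from highest to lowest rate, marking the top candidate for distinguishable display, and boosting the rate of functions the user has previously selected correctly. The following is a minimal illustrative sketch of that behavior only; the class names, threshold value, and boost weighting are assumptions for illustration, not taken from the patent:

```python
# Sketch of the menu-matching behavior outlined in claims 11-16:
# candidates are scored, filtered by a threshold, ordered high-to-low,
# and the top candidate would be displayed distinguishably.
from dataclasses import dataclass


@dataclass
class Candidate:
    menu: str
    base_rate: float          # recognition rate from the recognizer (0..1)
    times_selected: int = 0   # prior correct selections (claim 16)

    @property
    def rate(self) -> float:
        # Claim 16: raise the rate for previously selected functions,
        # capped at 1.0 (the 0.02-per-selection boost is an assumption).
        return min(1.0, self.base_rate + 0.02 * self.times_selected)


def match_menus(candidates, threshold=0.6):
    """Claims 11/13: keep candidates whose recognition rate exceeds the
    predetermined threshold, ordered from higher to lower rate."""
    kept = [c for c in candidates if c.rate > threshold]
    return sorted(kept, key=lambda c: c.rate, reverse=True)


candidates = [
    Candidate("camera", 0.85),
    Candidate("calendar", 0.55, times_selected=5),  # boosted to 0.65
    Candidate("calculator", 0.40),                  # below threshold, dropped
]
ranked = match_menus(candidates)
top = ranked[0]  # claim 15: this entry would be highlighted on screen
print([c.menu for c in ranked])  # → ['camera', 'calendar']
```

The threshold itself could equally be user- or manufacturer-configurable, as claim 14 specifies.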
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020080032843 | 2008-04-08 | ||
KR10-2008-0032843 | 2008-04-08 | ||
KR1020080032841A KR20090107364A (en) | 2008-04-08 | 2008-04-08 | Mobile terminal and its menu control method |
KR1020080032843A KR101521908B1 (en) | 2008-04-08 | 2008-04-08 | Mobile terminal and its menu control method |
KR1020080032841 | 2008-04-08 | ||
KR10-2008-0032841 | 2008-04-08 | ||
KR1020080033350 | 2008-04-10 | ||
KR1020080033350A KR101521909B1 (en) | 2008-04-10 | 2008-04-10 | Mobile terminal and its menu control method |
KR10-2008-0033350 | 2008-04-10 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101557432A CN101557432A (en) | 2009-10-14 |
CN101557432B true CN101557432B (en) | 2013-06-19 |
Family
ID=41175373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008101279100A Active CN101557432B (en) | 2008-04-08 | 2008-07-02 | Mobile terminal and menu control method thereof |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR20090107364A (en) |
CN (1) | CN101557432B (en) |
Families Citing this family (144)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
CN102056021A (en) * | 2009-11-04 | 2011-05-11 | 李峰 | Chinese and English command-based man-machine interactive system and method |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US9413869B2 (en) | 2010-02-10 | 2016-08-09 | Qualcomm Incorporated | Mobile device having plurality of input modes |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
CN101931701A (en) * | 2010-08-25 | 2010-12-29 | 宇龙计算机通信科技(深圳)有限公司 | Method, system and mobile terminal for prompting contact information in communication process |
CN102467336B (en) * | 2010-11-19 | 2013-10-30 | 联想(北京)有限公司 | Electronic equipment and object selection method thereof |
CN102685307A (en) * | 2011-03-15 | 2012-09-19 | 中兴通讯股份有限公司 | Method, device and system for processing command information |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US20130143525A1 (en) * | 2011-12-02 | 2013-06-06 | The Boeing Company | Point of Use Verified Aircraft Assembly Time Collection |
US8793136B2 (en) * | 2012-02-17 | 2014-07-29 | Lg Electronics Inc. | Method and apparatus for smart voice recognition |
KR101889836B1 (en) * | 2012-02-24 | 2018-08-20 | 삼성전자주식회사 | Method and apparatus for cotrolling lock/unlock state of terminal through voice recognition |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
CN103366743A (en) * | 2012-03-30 | 2013-10-23 | 北京千橡网景科技发展有限公司 | Voice-command operation method and device |
KR102652437B1 (en) | 2012-05-11 | 2024-03-27 | 가부시키가이샤 한도오따이 에네루기 켄큐쇼 | Electronic device, storage medium, program, and displaying method |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
CN103514882B (en) * | 2012-06-30 | 2017-11-10 | 北京百度网讯科技有限公司 | A kind of audio recognition method and system |
US9093072B2 (en) * | 2012-07-20 | 2015-07-28 | Microsoft Technology Licensing, Llc | Speech and gesture recognition enhancement |
CN103593081B (en) * | 2012-08-17 | 2017-11-07 | 上海博泰悦臻电子设备制造有限公司 | The control method of mobile unit and phonetic function |
CN103593134B (en) * | 2012-08-17 | 2018-01-23 | 上海博泰悦臻电子设备制造有限公司 | The control method of mobile unit and phonetic function |
US10042603B2 (en) | 2012-09-20 | 2018-08-07 | Samsung Electronics Co., Ltd. | Context aware service provision method and apparatus of user device |
KR102070196B1 (en) | 2012-09-20 | 2020-01-30 | 삼성전자 주식회사 | Method and apparatus for providing context aware service in a user device |
KR102012774B1 (en) * | 2012-11-19 | 2019-08-21 | 엘지전자 주식회사 | Mobil terminal and Operating Method for the Same |
CN103885661A (en) * | 2012-12-20 | 2014-06-25 | 联想(北京)有限公司 | Control method and control device |
CN103064530B (en) * | 2012-12-31 | 2017-03-08 | 华为技术有限公司 | input processing method and device |
DE112014000709B4 (en) | 2013-02-07 | 2021-12-30 | Apple Inc. | METHOD AND DEVICE FOR OPERATING A VOICE TRIGGER FOR A DIGITAL ASSISTANT |
KR102057629B1 (en) | 2013-02-19 | 2020-01-22 | 엘지전자 주식회사 | Mobile terminal and method for controlling of the same |
US9691382B2 (en) * | 2013-03-01 | 2017-06-27 | Mediatek Inc. | Voice control device and method for deciding response of voice control according to recognized speech command and detection output derived from processing sensor data |
CN104049722B (en) * | 2013-03-11 | 2017-07-25 | 联想(北京)有限公司 | A kind of information processing method and electronic equipment |
CN104077105B (en) * | 2013-03-29 | 2018-04-27 | 联想(北京)有限公司 | A kind of information processing method and a kind of electronic equipment |
US20140337031A1 (en) * | 2013-05-07 | 2014-11-13 | Qualcomm Incorporated | Method and apparatus for detecting a target keyword |
CN110248019B (en) * | 2013-06-08 | 2022-04-26 | 苹果公司 | Method, computer storage medium, and apparatus for voice-enabled dialog interface |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
EP3937002A1 (en) | 2013-06-09 | 2022-01-12 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
CN103699293A (en) * | 2013-12-02 | 2014-04-02 | 联想(北京)有限公司 | Operation method and electronic equipment |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
KR102218906B1 (en) | 2014-01-17 | 2021-02-23 | 엘지전자 주식회사 | Mobile terminal and controlling method thereof |
CN103885596B (en) * | 2014-03-24 | 2017-05-24 | 联想(北京)有限公司 | Information processing method and electronic device |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
TWI566107B (en) | 2014-05-30 | 2017-01-11 | 蘋果公司 | Method for processing a multi-part voice command, non-transitory computer readable storage medium and electronic device |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
CN104239043B (en) * | 2014-09-04 | 2017-10-31 | 百度在线网络技术(北京)有限公司 | The execution method and apparatus of instruction |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
CN104715754A (en) * | 2015-03-05 | 2015-06-17 | 北京华丰亨通科贸有限公司 | Method and device for rapidly responding to voice commands |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
KR102344045B1 (en) * | 2015-04-21 | 2021-12-28 | 삼성전자주식회사 | Electronic apparatus for displaying screen and method for controlling thereof |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
CN105094331B (en) * | 2015-07-27 | 2018-08-07 | 联想(北京)有限公司 | A kind of information processing method and electronic equipment |
CN105208204A (en) * | 2015-08-27 | 2015-12-30 | 北京羽乐创新科技有限公司 | Communication service processing method and apparatus |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
CN105573582A (en) * | 2015-12-14 | 2016-05-11 | 魅族科技(中国)有限公司 | Display method and terminal |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
CN105679315A (en) * | 2016-03-22 | 2016-06-15 | 谢奇 | Voice-activated and voice-programmed control method and control system |
CN105976157A (en) * | 2016-04-25 | 2016-09-28 | 中兴通讯股份有限公司 | Task creating method and task creating device |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
CN106683675A (en) * | 2017-02-08 | 2017-05-17 | 张建华 | Control method and voice operating system |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | User interface for correcting recognition errors |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK201770428A1 (en) | 2017-05-12 | 2019-02-18 | Apple Inc. | Low-latency intelligent automated assistant |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | Far-field extension for digital assistant services |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
CN107544827A (en) * | 2017-08-23 | 2018-01-05 | 金蝶软件(中国)有限公司 | The method and relevant apparatus of a kind of funcall |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | Virtual assistant operation in multi-device environments |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (en) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US11076039B2 (en) | 2018-06-03 | 2021-07-27 | Apple Inc. | Accelerated task performance |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
CN109658926B (en) * | 2018-11-28 | 2021-03-23 | 维沃移动通信有限公司 | Voice instruction updating method and mobile terminal |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
CN109976702A (en) * | 2019-03-20 | 2019-07-05 | 青岛海信电器股份有限公司 | A kind of audio recognition method, device and terminal |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | User activity shortcut suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
CN111078175A (en) * | 2019-12-25 | 2020-04-28 | 上海擎感智能科技有限公司 | Mail processing method, mobile terminal and computer storage medium |
CN111968637B (en) * | 2020-08-11 | 2024-06-14 | 北京小米移动软件有限公司 | Terminal equipment operation mode control method and device, terminal equipment and medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1346566A (en) * | 1999-02-08 | 2002-04-24 | 高通股份有限公司 | Voice recognition user interface for telephone handsets |
US7280970B2 (en) * | 1999-10-04 | 2007-10-09 | Beepcard Ltd. | Sonic/ultrasonic authentication device |
2008
- 2008-04-08 KR KR1020080032841A patent/KR20090107364A/en not_active Application Discontinuation
- 2008-07-02 CN CN2008101279100A patent/CN101557432B/en active Active
Also Published As
Publication number | Publication date |
---|---|
KR20090107364A (en) | 2009-10-13 |
CN101557432A (en) | 2009-10-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101557432B (en) | Mobile terminal and menu control method thereof | |
US9900414B2 (en) | Mobile terminal and menu control method thereof | |
CN101605171B (en) | Mobile terminal and text correcting method in the same | |
CN101557651B (en) | Mobile terminal and menu control method thereof | |
EP2747389B1 (en) | Mobile terminal having auto answering function and auto answering method for use in the mobile terminal | |
RU2412463C2 (en) | Mobile communication terminal and menu navigation method for said terminal | |
KR101462930B1 (en) | Mobile terminal and its video communication control method | |
KR101466027B1 (en) | Mobile terminal and its call contents management method | |
US9111538B2 (en) | Genius button secondary commands | |
CN101604521B (en) | Mobile terminal and method for recognizing voice thereof | |
CN101971250B (en) | Mobile electronic device with active speech recognition | |
US20100009719A1 (en) | Mobile terminal and method for displaying menu thereof | |
CN104978868A (en) | Stop arrival reminding method and stop arrival reminding device | |
CN101714057A (en) | Touch input device of portable device and operating method using the same | |
KR20150086030A (en) | Mobile terminal and controlling method thereof | |
CN105354017B (en) | Information processing method and device | |
KR20090115599A (en) | Mobile terminal and its information processing method | |
JP2016526358A (en) | Voice emoticon control method for portable terminal | |
KR101521909B1 (en) | Mobile terminal and its menu control method | |
CN104794074B (en) | External equipment recognition methods and device | |
CN105338163B (en) | Method and device for realizing communication and multi-card mobile phone | |
CN109285545A (en) | Information processing method and device | |
CN104660819B (en) | Mobile device and the method for accessing file in mobile device | |
CN107124512A (en) | The switching method and apparatus of audio-frequency play mode | |
CN108021399A (en) | Note processing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |