CN103811007B - Display device, voice acquisition device and speech recognition method thereof - Google Patents

Display device, voice acquisition device and speech recognition method thereof

Info

Publication number
CN103811007B
CN103811007B CN201310553280.4A
Authority
CN
China
Prior art keywords
voice
instruction word
application
display device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310553280.4A
Other languages
Chinese (zh)
Other versions
CN103811007A (en)
Inventor
蒋种赫
崔赞熙
柳熙涉
朴劲美
朴胜权
裵在铉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Priority to CN201510276510.6A priority Critical patent/CN104883587A/en
Publication of CN103811007A publication Critical patent/CN103811007A/en
Application granted granted Critical
Publication of CN103811007B publication Critical patent/CN103811007B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • G10L15/30Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/28Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803Home automation networks
    • H04L12/2816Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/4104Peripherals receiving signals from specially adapted client devices
    • H04N21/4126The peripheral being portable, e.g. PDAs or mobile phones
    • H04N21/41265The peripheral being portable, e.g. PDAs or mobile phones having a remote control device for bidirectional communication between the remote control device and client device
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/4104Peripherals receiving signals from specially adapted client devices
    • H04N21/4131Peripherals receiving signals from specially adapted client devices home appliance, e.g. lighting, air conditioning system, metering devices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/43615Interfacing a Home Network, e.g. for connecting the client to a plurality of peripherals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/221Announcement of recognition results

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Automation & Control Theory (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)
  • Selective Calling Equipment (AREA)

Abstract

A display device, a voice acquisition device, and a speech recognition method thereof are disclosed. The display device includes: a display unit which displays an image; a communication unit which communicates with a plurality of external devices; and a controller which includes a speech recognition engine for recognizing a user's voice, receives a voice signal from a voice acquisition unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external devices in order to recognize the received voice signal.

Description

Display device, voice acquisition device and speech recognition method thereof
Technical field
Apparatuses and methods consistent with the exemplary embodiments relate to a display device, a voice acquisition device, and a speech recognition method thereof, and more particularly to a display device, a voice acquisition device, and a speech recognition method thereof which recognize a user's voice.
Background art
A speech recognition function is used in various electronic devices such as digital televisions (TVs), air conditioners, home theaters, personal computers (PCs), and mobile phones.
To perform the speech recognition function, a main device such as a TV should have a microphone for receiving a user's voice and a speech recognition engine for recognizing the input voice, and the speech recognition engine may compare the input voice with stored candidate instruction words and recognize the voice according to the comparison result.
However, a related-art electronic device with a speech recognition function has a fixed means for receiving the user's voice, and therefore cannot easily utilize other input devices, such as a mobile phone, for voice input. Also, if many candidate instruction words are provided, the recognition rate improves, but the electronic device must compare more candidate instruction words, which slows the speech recognition processing. Furthermore, since the storage capacity of the main device is limited, the number of candidate instruction words cannot be increased indefinitely.
Summary of the invention
According to an aspect of an exemplary embodiment, there is provided a display device including: a display unit which displays an image thereon; a communication unit which communicates with a plurality of external devices; and a controller which includes a speech recognition engine for recognizing a user's voice, receives a voice signal from a voice acquisition unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external devices in order to recognize the received voice signal.
A plurality of voice acquisition units may be provided. If a voice input is detected by at least one of the plurality of voice acquisition units, the controller receives a voice signal from the voice acquisition unit to which the voice input was detected.
The voice acquisition unit may include at least one of the following: a built-in microphone provided in the display device, a first external microphone provided in at least one of the plurality of external devices, and a second external microphone different from the built-in microphone and the first external microphone.
The external device may include at least one application capable of managing the candidate instruction words.
The display device may further include a native application which manages the candidate instruction words.
The display device may further include a storage unit which stores the received candidate instruction words, and the speech recognition engine may recognize the received voice by using the stored candidate instruction words.
If at least one of the plurality of voice acquisition units detects a wake-up keyword, the controller may enable the voice acquisition unit which detected the wake-up keyword, and receive a voice signal from the enabled voice acquisition unit.
If a trigger signal is input by manipulating a predetermined button provided in one of the plurality of voice acquisition units, the controller may enable the voice acquisition unit through which the trigger signal was input, and receive a voice signal from the enabled voice acquisition unit.
The controller may control the display unit to display thereon a speech recognition result for the voice signal and candidate instruction words corresponding to the speech recognition result.
The display unit may display thereon information about the application which manages the candidate instruction words.
The speech recognition engine may recognize the voice by determining, among the received candidate instruction words, an instruction word which is identical or similar to the received voice signal.
According to an aspect of another exemplary embodiment, there is provided a voice acquisition device including: a communication unit which communicates with a display device having a speech recognition function; a voice acquisition unit which receives a user's voice; a speech converter which converts the received voice into an electric voice signal; and a controller which controls the communication unit to transmit the converted voice signal and candidate instruction words to the display device so that the voice signal can be recognized.
The voice acquisition device may further include at least one application capable of managing the candidate instruction words.
According to an aspect of another exemplary embodiment, there is provided a speech recognition method of a display device, including: receiving a voice signal from a voice acquisition unit; receiving candidate instruction words from at least one of a plurality of external devices in order to recognize the received voice signal; and recognizing the user's voice based on the received voice signal and the candidate instruction words.
The speech recognition method may further include detecting a voice input to at least one of a plurality of voice acquisition units, and the receiving of the voice signal may include receiving the voice signal from the voice acquisition unit to which the voice input was detected.
The voice acquisition unit may include at least one of the following: a built-in microphone provided in the display device, a first external microphone provided in at least one of the plurality of external devices, and a second external microphone provided in a device different from the display device and the plurality of external devices.
The external device may include at least one application which manages the candidate instruction words.
The display device may further include a native application which manages the candidate instruction words.
The speech recognition method may further include storing the received candidate instruction words, and the recognizing of the voice may include recognizing the voice by using the stored candidate instruction words.
The detecting of the voice input may include: detecting a wake-up keyword input to one of the plurality of voice acquisition units, and enabling the voice acquisition unit which detected the wake-up keyword.
The detecting of the voice input may include: detecting input of a trigger signal according to manipulation of a predetermined button provided in one of the plurality of voice acquisition units, and enabling the voice acquisition unit through which the trigger signal was input.
The speech recognition method may further include: displaying a speech recognition result for the voice signal and candidate instruction words corresponding to the speech recognition result.
The displaying may include displaying information about the application which manages the candidate instruction words.
The recognizing of the voice may include recognizing the voice by determining, among the received candidate instruction words, an instruction word which is identical or similar to the received voice signal.
Brief description of the drawings
The above and/or other aspects will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 illustrates an example of a speech recognition system according to an exemplary embodiment;
Fig. 2 is a block diagram of the speech recognition system according to an exemplary embodiment;
Fig. 3 illustrates an example of performing speech recognition according to an exemplary embodiment;
Fig. 4 illustrates an example of a screen displaying the result of the speech recognition in Fig. 3;
Fig. 5 illustrates an example of performing speech recognition according to another exemplary embodiment;
Fig. 6 is a flowchart showing a speech recognition method of the speech recognition system according to an exemplary embodiment;
Fig. 7 is a flowchart showing details of the operation of detecting a voice input in Fig. 6; and
Fig. 8 is a flowchart showing details of the operation of performing speech recognition in Fig. 6.
Detailed description
Hereinafter, exemplary embodiments will be described in detail with reference to the accompanying drawings. The exemplary embodiments may be embodied in various forms and are not limited to those set forth herein. Descriptions of well-known components are omitted for clarity, and like reference numerals refer to like elements throughout.
Fig. 1 illustrates an example of a speech recognition system according to an exemplary embodiment.
As shown in Fig. 1, the speech recognition system includes a master device 100, a plurality of voice acquisition devices 201 and 202, and a plurality of external devices 301, 302, and 303. The master device 100, the plurality of voice acquisition devices 201 and 202, and the plurality of external devices 301, 302, and 303 are connected to one another to communicate with one another.
The master device 100 includes a voice acquisition unit 140, such as a microphone, which receives a user's voice, a speech recognition engine 181 which recognizes the input voice, and a communication unit 160 which communicates with the plurality of voice acquisition devices 201 and 202 and the plurality of external devices 301, 302, and 303. The master device 100 further includes native applications 171 and 172 which are driven to perform various functions (services) of the master device 100. The native applications 171 and 172 store in advance candidate instruction words corresponding to those functions; that is, the native applications 171 and 172 are included in the available service scenarios. During speech recognition, the candidate instruction words stored in the native applications 171 and 172 are transmitted to the speech recognition engine 181 so that the speech recognition engine 181 can perform speech recognition.
Each of the plurality of voice acquisition devices 201 and 202 may include a voice acquisition unit, such as a microphone, which receives a user's voice, and transmits a voice signal corresponding to the received voice to the master device 100 for speech recognition.
The plurality of voice acquisition devices 201 and 202 may receive the user's voice, convert the voice into an electric voice signal, and transmit the electric voice signal to the master device 100. The plurality of voice acquisition devices 201 and 202 may perform wireless communication with the master device 100. The wireless communication includes, but is not limited to, wireless LAN, radio frequency (RF) communication, Bluetooth, Zigbee, and infrared (IR) communication.
The plurality of external devices 301, 302, and 303 may include, as needed, at least one development application (dev. application) for performing their functions (services). The development application stores in advance candidate instruction words corresponding to the functions performed by the external devices 301, 302, and 303. During speech recognition, the candidate instruction words stored in the development application are transmitted to the speech recognition engine 181 so that the speech recognition engine 181 can perform speech recognition.
The pre-stored candidate instruction words may be instruction words related to the functions/operations of the native applications 171 and 172 and of the development applications. For example, if the master device 100 is a TV, candidate instruction words related to changing the channel, adjusting the volume, and so on of the TV may be stored in one of the native applications 171 and 172. If the external device 302 is an air conditioner, candidate instruction words related to adjusting the temperature (up/down), adjusting the fan intensity (strong/weak/medium), and so on of the air conditioner may be stored in an application included in the external device 302.
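As a rough illustration of how candidate instruction words might be grouped per application, the sketch below (Python) models a native TV application and an air-conditioner development application, each exposing the words for its own functions. The class name and the word lists are invented for illustration; the patent does not prescribe any particular data structure.

```python
# Minimal sketch: each application exposes the candidate instruction words
# for the functions it implements. Names and word lists are illustrative.

class Application:
    def __init__(self, name, kind, candidate_words):
        self.name = name                        # e.g. "first application (TV)"
        self.kind = kind                        # "native" or "development"
        self.candidate_words = set(candidate_words)

tv_native_app = Application(
    "first application (TV)", "native",
    ["volume up", "volume down", "channel up", "channel down"],
)

aircon_dev_app = Application(
    "fourth application (air conditioner)", "development",
    ["increase temperature", "decrease temperature",
     "wind strong", "wind weak", "wind medium"],
)

if __name__ == "__main__":
    for app in (tv_native_app, aircon_dev_app):
        print(app.name, "->", sorted(app.candidate_words))
```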
An external device or a voice acquisition device may include both a voice acquisition unit and a development application. In this case, if a voice is input to the voice acquisition unit in the first external device 301, the candidate instruction words stored in advance in the development application of the first external device 301 are transmitted to the speech recognition engine 181 of the master device 100 to perform speech recognition.
The speech recognition system according to the exemplary embodiment includes at least one voice acquisition unit. If a voice input to a voice acquisition unit is detected, the speech recognition system receives a voice stream by enabling the voice acquisition unit to which the voice input was detected. If a plurality of voice acquisition units are provided, the speech recognition system may receive the voice stream by enabling, among the plurality of voice acquisition units, the voice acquisition unit to which the voice input was detected. The plurality of voice acquisition units may include a built-in microphone provided in the master device 100, a first external microphone provided in at least one of the plurality of external devices 301, 302, and 303, and a second external microphone provided in the voice acquisition devices 201 and 202, which are separate from the master device 100 and the plurality of external devices 301, 302, and 303.
If at least one of the plurality of voice acquisition units detects a wake-up keyword, the master device 100 may enable the voice acquisition unit which detected the wake-up keyword and receive a voice signal from the enabled voice acquisition unit. If a trigger signal is input by manipulating a predetermined button (that is, an event occurs) in at least one of the plurality of voice acquisition units, the master device 100 may enable the voice acquisition unit through which the trigger signal was input and receive a voice signal from the enabled voice acquisition unit.
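A minimal sketch of this enabling logic, assuming a simple event object that carries the unit which detected the wake-up keyword or button trigger; the class and function names are illustrative, not taken from the patent.

```python
# Sketch: enable only the voice acquisition unit that produced the wake-up
# keyword or button trigger, and disable the others to avoid recognition
# errors from simultaneous inputs. All names are illustrative assumptions.

class VoiceAcquisitionUnit:
    def __init__(self, name):
        self.name = name
        self.enabled = False

def on_activation_event(units, source_unit):
    """Called when a wake-up keyword or button trigger arrives from source_unit."""
    for unit in units:
        unit.enabled = (unit is source_unit)
    return source_unit      # voice signals are then read from this unit only

if __name__ == "__main__":
    built_in = VoiceAcquisitionUnit("built-in microphone (140)")
    phone = VoiceAcquisitionUnit("mobile phone microphone (240)")
    remote = VoiceAcquisitionUnit("remote controller microphone (340)")
    units = [built_in, phone, remote]

    active = on_activation_event(units, remote)   # e.g. talk button pressed on the remote
    print([(u.name, u.enabled) for u in units])
    print("receiving voice stream from:", active.name)
```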
The master device 100 may operate in a speech recognition mode. If at least one voice acquisition unit is enabled by the wake-up keyword or the trigger signal, the master device 100 may disable the other voice acquisition units to prevent speech recognition errors. The master device 100 may operate in a distant or adjacent voice recognition mode. For the user's convenience, the master device 100 may display a user interface (UI) showing the voice acquisition unit connected to the display unit 130 (to be described later).
The master device 100 may receive candidate instruction words from at least one of the plurality of external devices 301, 302, and 303 to recognize the received voice signal. The received candidate instruction words may be transmitted to the speech recognition engine 181 for speech recognition.
The plurality of external devices 301, 302, and 303 include at least one application which manages candidate instruction words. The master device 100 includes native applications 171 and 172 which manage candidate instruction words. The candidate instruction words managed by the native applications 171 and 172 may be transmitted to the speech recognition engine 181 for speech recognition.
The master device 100 may be implemented as a display device, such as a television (TV), as in Fig. 2.
Fig. 2 is a block diagram of the speech recognition system according to an exemplary embodiment.
The display device 100 processes an image signal from an external image supply source (not shown) to display an image based on the processed image signal.
In the speech recognition system according to the exemplary embodiment, the display device 100 is implemented as a TV or set-top box which processes broadcast images based on broadcast signals/broadcast information/broadcast data transmitted from a broadcasting station. However, it is understood that in one or more other exemplary embodiments, the display device 100 may be applied to various other devices which process and display images, in addition to the TV or set-top box. For example, the display device 100 may include a personal computer (PC), a laptop computer, and the like.
Furthermore, it is understood that the type of image displayable by the display device 100 is not limited to broadcast images. For example, the display device 100 may display, based on signals/data transmitted from various image supply sources (not shown), videos, still images, applications, on-screen displays (OSDs), graphic user interfaces (GUIs) for controlling various operations, and so on.
According to an exemplary embodiment, the display device 100 may be implemented as a smart TV. A smart TV can receive and display broadcast signals in real time, has a web browser function for displaying broadcast signals in real time while searching various contents on the Internet, and provides a convenient user environment for doing so. A smart TV may include an open software platform which provides interactive services to the user, and can provide the user with various contents through the open software platform, for example, applications providing predetermined services. Such applications can provide various types of services, for example, SNS, finance, news, weather, maps, music, movies, games, e-books, and so on.
The display device 100 includes a speech recognition engine 181 for recognizing a user's voice. A command (for example, a control command) corresponding to the recognized voice is transmitted to the corresponding application to perform an operation. If the application corresponding to the control command is one of the native applications 171 and 172, the display device 100 performs the operation through that application according to the control command. If the application corresponding to the control command is a development application, the control command is transmitted to the external device 301, 302, or 303 which includes the development application, and the external device 301, 302, or 303 may perform the operation through that application according to the control command.
Referring to Fig. 2, a plurality of voice acquisition devices, for example a mobile phone 200 and a remote controller 300, are provided. The remote controller 300 may serve as both a voice acquisition device and an external device. The mobile phone 200 may be a smartphone with a voice acquisition function.
The remote controller 300 may transmit a preset command (control command) to a corresponding device according to a user's manipulation. The remote controller 300 may be configured to transmit commands to the display device 100 or to an external device, and may be implemented as an integrated remote controller which transmits commands to a plurality of devices. The remote controller 300 may include a TV remote controller and/or an air conditioner remote controller.
The voice acquisition device may be implemented as various devices which receive a user's voice, for example, a mobile phone, a microphone transmitter, and the like.
As shown in Fig. 2, a plurality of external devices, for example the remote controller 300 and an air conditioner 400, are provided. As described above, the remote controller 300 may serve as both a voice acquisition device and an external device.
Although Fig. 2 illustrates the external devices as the remote controller 300 and the air conditioner 400, the exemplary embodiments are not limited thereto. For example, the external device may be implemented as various other electronic devices which perform wireless communication, for example, a home theater, a radio, a VCR, a DVD player, a washing machine, a refrigerator, a robot vacuum cleaner, and the like. If the external device includes a voice acquisition unit such as a microphone, it may also serve as a voice acquisition device.
The external devices according to the exemplary embodiment include applications 372 and 472 for performing their respective functions. The applications 372 and 472 store candidate instruction words in advance and manage those candidate instruction words. The candidate instruction words may be transmitted to the display device 100 for speech recognition.
The external devices, that is, the remote controller 300 and the air conditioner 400, may perform operations corresponding to control commands transmitted by the display device 100 according to the result of the speech recognition.
Hereinafter, each element of the speech recognition system will be described in detail with reference to Fig. 2.
The display device 100 may include: an image receiver 110 which receives an image signal; an image processor 120 which processes the image signal received from the image receiver 110; a display unit 130 which displays an image based on the image signal processed by the image processor 120; a first voice acquisition unit 140 which receives a user's voice; a first speech converter 150 which converts the received voice into an electric voice signal; a first communication unit 160 which communicates with external devices; a first storage unit 170 which stores various data; and a first controller 180 which controls the display device 100.
The image receiver 110 receives an image signal and transmits the image signal to the image processor 120. For example, the image receiver 110 may wirelessly receive a radio frequency (RF) signal from a broadcasting station (not shown), or may receive an image signal in a wired manner according to standards such as composite video, component video, super video, SCART (Radio and Television Receiver Manufacturers' Association), and high-definition multimedia interface (HDMI). If the image signal includes a broadcast signal, the image receiver 110 includes a tuner which tunes the broadcast signal by channel.
The image signal may be received from an external device, for example, a PC, an AV device, a smartphone, a smart pad, or the like. The image signal may be data transmitted through a network such as the Internet. In this case, the display device 100 may perform network communication through the first communication unit 160, or may include an additional network communication unit. Alternatively, the image signal may be data stored in the first storage unit 170 (for example, a flash memory, a hard disk drive (HDD), or the like). The first storage unit 170 may be provided inside or outside the display device 100. If the first storage unit 170 is provided outside the display device 100, the display device 100 may include a connector (not shown) to which the first storage unit 170 is connected.
The image processor 120 performs various image processing operations on the image signal and outputs the processed image signal to the display unit 130.
The image processing operations of the image processor 120 may include, but are not limited to, a decoding operation corresponding to various image formats, a de-interlacing operation, a frame refresh rate conversion, a scaling operation, a noise reduction operation for improving image quality, a detail enhancement operation, a line scanning operation, and the like. The image processor 120 may be implemented as individual components which independently perform the foregoing operations, or as a system-on-chip (SoC) which performs integrated functions.
The display unit 130 displays an image based on the image signal processed by the image processor 120. The display unit 130 may include, but is not limited to, a liquid crystal display (LCD), a plasma display panel (PDP), a light-emitting diode (LED) display, an organic light-emitting diode (OLED) display, a surface-conduction electron-emitter display, a carbon nanotube display, a nano-crystal display, and the like.
The display unit 130 may include additional elements depending on its implementation type. For example, the display unit 130 of the LCD type includes an LCD panel (not shown), a backlight unit (not shown) which emits light to the LCD panel, and a panel driving substrate (not shown) which drives the LCD panel.
The display unit 130 may display a speech recognition result as information about the recognized voice. The speech recognition result may be displayed in various forms such as text, graphics, icons, and the like. Text includes characters and numbers. The display unit 130 may also display candidate instruction words according to the speech recognition result, and application information. This will be described in more detail later with reference to Fig. 4.
The user may check, based on the speech recognition result displayed on the display unit 130, whether the voice has been correctly recognized. The user may manipulate the user input unit 330 of the remote controller 300 to select, from the displayed candidate instruction words, the instruction word corresponding to the user's voice, or may select and check information related to the speech recognition result.
The first voice acquisition unit 140 receives a user's voice and may be implemented as a microphone.
The first speech converter 150 converts the voice input through the first voice acquisition unit 140 into an electric voice signal. The converted voice signal may have a pulse code modulation (PCM) format or a compressed audio waveform format. The first speech converter 150 may be implemented as an A/D converter which converts the user's voice into digital form.
If the first voice acquisition unit 140 is a digital microphone, no additional A/D conversion is needed. In this case, the first voice acquisition unit 140 may include the first speech converter 150.
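For illustration, the sketch below quantizes a floating-point sample stream into 16-bit PCM, roughly the kind of conversion a speech converter / A-D converter performs before the voice signal is transmitted. The sample rate and bit depth are assumptions; the patent only names PCM or a compressed waveform.

```python
# Sketch: convert floating-point microphone samples (-1.0..1.0) into 16-bit
# PCM bytes. Sample rate and bit depth are illustrative assumptions.
import math
import struct

def to_pcm16(samples):
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    return b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)

if __name__ == "__main__":
    rate = 16000                       # assumed sampling rate in Hz
    tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 100)]
    pcm = to_pcm16(tone)               # 10 ms of a 440 Hz test tone
    print(len(tone), "samples ->", len(pcm), "PCM bytes")
```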
The first communication unit 160 communicates with the voice acquisition devices and the external devices, that is, with the mobile phone 200, the remote controller 300, and the air conditioner 400. The first communication unit 160 may perform wireless communication including at least one of infrared communication, RF, Zigbee, and Bluetooth.
The first storage unit 170 stores data under the control of the first controller 180. The first storage unit 170 is implemented as a non-volatile storage medium such as a flash memory, a hard disk drive (HDD), or the like. The first storage unit 170 is accessed by the first controller 180, which reads/writes/modifies/deletes/updates the data.
The data stored in the first storage unit 170 includes, for example, an operating system (OS) for driving the display device 100, various applications run on the OS, image data, additional data, and the like.
The first storage unit 170 may store various data for recognizing a user's voice. For example, the first storage unit 170 may store an instruction word table 171 (hereinafter also referred to as a candidate instruction word group) containing candidate instruction words, as recognized voice information corresponding to received voice signals. In the instruction word table 171, the candidate instruction words may be managed by the corresponding applications.
The first storage unit 170 may also store at least one application, for example, a first application 172 and a second application 173, for performing the functions of the display device 100. The first application 172 and the second application 173 are driven under the control of the first controller 180 (to be described later) and perform various functions of the display device 100. Although Fig. 2 illustrates the display device 100 in which two applications 172 and 173 are installed, the exemplary embodiments are not limited thereto. That is, three or more applications may be installed in the display device 100.
The first application 172 and the second application 173 may manage candidate instruction words corresponding to the functions they perform. The candidate instruction words managed by the first application 172 and the second application 173 may be registered to/deleted from the instruction word table 171.
If candidate instruction words are registered in the instruction word table 171, the speech recognition engine 181 performs speech recognition by using the candidate instruction words in the instruction word table 171.
The candidate instruction words which may be registered to/deleted from the instruction word table 171 may include the candidate instruction words managed by a third application 372 of the remote controller 300 (to be described later) and the candidate instruction words managed by a fourth application 472 of the air conditioner 400.
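The sketch below models the instruction word table 171 as a small registry keyed by the managing application, so that an application's words can be registered and later deleted as a group. The dict-based layout and the method names are assumptions made for illustration.

```python
# Sketch of the instruction word table 171: candidate instruction words are
# registered per managing application and can be removed as a group, e.g.
# when the application (or its device) goes away. The layout is an assumption.

class InstructionWordTable:
    def __init__(self):
        self._words_by_app = {}                  # app name -> set of words

    def register(self, app_name, words):
        self._words_by_app.setdefault(app_name, set()).update(words)

    def delete(self, app_name):
        self._words_by_app.pop(app_name, None)

    def candidate_words(self):
        """All currently registered words, with the app that manages each."""
        return [(word, app) for app, words in self._words_by_app.items()
                for word in words]

if __name__ == "__main__":
    table = InstructionWordTable()
    table.register("first application (172)", ["volume up", "volume down"])
    table.register("third application (372)", ["channel up", "channel down"])
    print(table.candidate_words())
    table.delete("third application (372)")      # e.g. remote controller disconnects
    print(table.candidate_words())
```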
The first controller 180 controls the various elements of the display device 100. For example, the first controller 180 controls the image processor 120 to process image signals, and performs control operations in response to commands from the remote controller 300 to control the overall operation of the display device 100.
For example, the first controller 180 may be implemented as a central processing unit (CPU) combined with software.
The first controller 180 may include the speech recognition engine 181 which recognizes a user's voice. The speech recognition engine 181 may perform the speech recognition function by using a known speech recognition algorithm. For example, the speech recognition engine 181 extracts a speech feature vector from the voice signal and compares the extracted speech feature vector with the candidate instruction words stored in the instruction word table 171 of the first storage unit 170 to recognize the voice. If there is no candidate instruction word in the instruction word table 171 identical to the speech feature vector, the speech recognition engine 181 may recognize the voice by adjusting the speech recognition result using the most similar instruction word. If there are a plurality of similar candidate instruction words, the first controller 180 may display the plurality of candidate instruction words on the display unit 130 so that the user can select one of them.
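The patent leaves the acoustic matching to a known speech recognition algorithm; the sketch below stands in for it with a plain text-similarity comparison (Python's difflib) purely to show the selection step: pick the identical or most similar candidate instruction word, or hand back several close candidates for the user to choose from. The thresholds and the text-based scoring are illustrative assumptions, not the patent's method.

```python
# Stand-in sketch for the matching step: the real engine compares a speech
# feature vector against the registered candidate instruction words; here a
# text similarity score plays that role purely for illustration.
from difflib import SequenceMatcher

def recognize(decoded_text, candidate_words, min_score=0.6, margin=0.15):
    scored = sorted(
        ((SequenceMatcher(None, decoded_text, w).ratio(), w) for w in candidate_words),
        reverse=True,
    )
    best_score, best_word = scored[0]
    if best_score < min_score:
        return {"result": None, "ask_user": []}          # nothing recognizable
    close = [w for s, w in scored if best_score - s <= margin and s >= min_score]
    if len(close) > 1:
        return {"result": None, "ask_user": close}       # let the user choose on screen
    return {"result": best_word, "ask_user": []}

if __name__ == "__main__":
    words = ["volume up", "volume down", "channel up", "increase temperature"]
    print(recognize("volume upp", words))    # near-identical -> "volume up"
    print(recognize("volume", words))        # several close candidates -> ask the user
```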
The speech recognition engine 181 according to the exemplary embodiment is implemented as an embedded speech recognition engine 181 provided in the CPU, but is not limited thereto. For example, the speech recognition engine 181 may be implemented as a device provided in the display device 100 separately from the CPU, that is, as an additional chip such as a microcomputer.
However, the exemplary embodiments are not limited thereto, and include the case where the speech recognition engine 181 is provided in a server separate from the display device 100 (hereinafter referred to as a cloud server (not shown)). The cloud server communicates with the display device 100 through a network such as the Internet. The network may be a wired network or a wireless network. In this case, the speech recognition engine 181 is implemented as an embedded speech recognition engine provided in the CPU of the cloud server, or as a device provided in the cloud server separately from the CPU, that is, an additional chip such as a microcomputer.
The first controller 180 may perform an operation corresponding to the recognition result of the speech recognition engine 181. For example, if the display device 100 is a TV and the user is watching a movie or the news, the speech recognition engine 181 may recognize a voice such as "volume up", "volume down", "louder", or "quieter", and the first controller 180 may adjust the volume of the movie or the news according to the voice.
If the speech recognition engine 181 recognizes a voice for controlling an external device such as the remote controller 300 or the air conditioner 400, the first controller 180 may control the first communication unit 160 to transmit a control command to the external device corresponding to the recognized voice. For example, if the speech recognition engine 181 recognizes the voice "raise the temperature", the first controller 180 may recognize that the voice is for controlling the air conditioner 400, and control the first communication unit 160 to transmit a command to the air conditioner 400 to raise the temperature of the air conditioner 400.
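A sketch of this dispatch step, assuming each registered word remembers which application manages it and whether that application lives in the display device or in an external device. The routing table and strings are invented for illustration.

```python
# Sketch: route the control command for a recognized instruction word either
# to a native application of the display device or, via the first
# communication unit 160, to the external device whose development
# application manages that word. The routing table is an assumption.

ROUTES = {
    "volume up":            ("native",   "first application (172)"),
    "volume down":          ("native",   "first application (172)"),
    "increase temperature": ("external", "air conditioner 400 / fourth application (472)"),
    "channel up":           ("external", "remote controller 300 / third application (372)"),
}

def dispatch(recognized_word):
    kind, target = ROUTES[recognized_word]
    if kind == "native":
        return f"display device executes '{recognized_word}' through {target}"
    # external: the first communication unit 160 would transmit the command
    return f"transmit control command '{recognized_word}' to {target}"

if __name__ == "__main__":
    print(dispatch("volume down"))
    print(dispatch("increase temperature"))
```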
Hereinafter, the detailed configuration of the mobile phone 200 will be described.
As shown in Fig. 2, the mobile phone 200 may include a second voice acquisition unit 240 which receives a user's voice, a second speech converter 250 which converts the received voice into an electric voice signal, a second communication unit 260 which communicates with the outside, a second storage unit 270 which stores data, and a second controller 280 which controls the mobile phone 200.
The second voice acquisition unit 240 which receives the user's voice may be implemented as a microphone. The second speech converter 250 converts the received voice into an electric voice signal. The converted voice signal may have a pulse code modulation (PCM) format or a compressed audio waveform format. The second speech converter 250 may be implemented as an A/D converter which converts the user's input voice into digital form.
If the second voice acquisition unit 240 is a digital microphone, no additional A/D conversion is needed. In this case, the second voice acquisition unit 240 may include the second speech converter 250.
The second communication unit 260, which communicates with the display device 100, may perform wired or wireless communication. The wireless communication may include at least one of RF, Zigbee, and Bluetooth.
The second communication unit 260 may transmit the voice signal from the second speech converter 250 to the display device 100.
The second storage unit 270 may store data under the control of the second controller 280. The second storage unit 270 is implemented as a non-volatile storage medium such as a flash memory. The second storage unit 270 is accessed by the second controller 280, which reads/writes/modifies/deletes/updates the data.
The data stored in the second storage unit 270 may include, for example, an OS for driving the mobile phone 200, various applications run on the OS, image data, additional data, and the like.
The second controller 280 may control the various elements of the mobile phone 200. For example, the second controller 280 may generate a command in response to a user's manipulation, perform an operation corresponding to the generated command, and display the result on a display unit (not shown).
The second controller 280 may be implemented as a micro controller unit (MCU) combined with software.
If a user's voice is input through the second voice acquisition unit 240, the second controller 280 controls the second speech converter 250 to convert the user's voice into an electric voice signal and controls the second communication unit 260 to transmit the converted voice signal to the display device 100.
Hereinafter, the detailed configuration of the remote controller 300 will be described.
As shown in Fig. 2, the remote controller 300, which serves as both a voice acquisition device and an external device, may include: a user input unit 330 which receives a user's manipulation; a third voice acquisition unit 340 which receives a user's voice; a third speech converter 350 which converts the received voice into an electric voice signal; a third communication unit 360 which communicates with the outside; a third storage unit 370 which stores data; and a third controller 380 which controls the remote controller 300.
The user input unit 330 may transmit various control commands or information to the third controller 380 according to the user's manipulation. The user input unit 330 may be implemented as menu keys, number keys, and the like provided on the remote controller 300. If the remote controller 300 is a TV remote controller, the user input unit 330 may include a touch sensor which receives the user's touch input and/or a motion sensor which senses the motion of the remote controller 300.
The third voice acquisition unit 340 which receives the user's voice may be implemented as a microphone.
The third speech converter 350 converts the voice input through the third voice acquisition unit 340 into an electric voice signal. The converted voice signal may have a pulse code modulation (PCM) format or a compressed audio waveform format. The third speech converter 350 may be implemented as an A/D converter which converts the user's input voice into digital form.
If the third voice acquisition unit 340 is a digital microphone, no additional A/D conversion is needed. In this case, the third voice acquisition unit 340 may include the third speech converter 350.
The third communication unit 360 communicates with the display device 100. The third communication unit 360 performs wireless communication. The wireless communication includes at least one of RF, Zigbee, and Bluetooth.
The third communication unit 360 transmits to the display device 100 the voice signal from the third speech converter 350 and the candidate instruction words managed by the third application 372 of the third storage unit 370 (to be described later).
The third storage unit 370, which stores data under the control of the third controller 380, may be implemented as a non-volatile storage medium such as a flash memory. The third storage unit 370 is accessed by the third controller 380, which reads/writes/modifies/deletes/updates the data.
The data stored in the third storage unit 370 includes, for example, an OS for driving the remote controller 300, various applications run on the OS, image data, additional data, and the like.
The third storage unit 370 may also store at least one application, for example, a third application 372 for performing the functions of the remote controller 300. The third application 372 is driven under the control of the third controller 380 (to be described later) and performs various functions of the remote controller 300. Here, the third application 372 and the fourth application 472 (to be described later) will be referred to as development applications, to distinguish them from the native applications 172 and 173 of the display device 100.
Although Fig. 2 illustrates the remote controller 300 in which one application 372 is installed, the exemplary embodiments are not limited thereto. That is, two or more applications may be installed in the remote controller 300.
The third application 372 may manage candidate instruction words corresponding to the functions it performs. The candidate instruction words managed by the third application 372 may be registered to/deleted from the instruction word table 171 of the display device 100.
The third controller 380 may control the various elements of the remote controller 300. For example, the third controller 380 may generate a command in response to the user's manipulation of the user input unit 330, and control the third communication unit 360 to transmit the generated command to the display device 100.
The third controller 380 may be implemented as an MCU combined with software.
If a user's voice is input through the third voice acquisition unit 340, the third controller 380 controls the third speech converter 350 to convert the user's voice into an electric voice signal and controls the third communication unit 360 to transmit the converted voice signal to the display device 100.
When communicating with the display device 100, the third controller 380 may transmit to the display device 100 the candidate instruction words managed by the third application 372 of the third storage unit 370. The transmitted candidate instruction words are registered in the instruction word table 171 of the display device 100 and used by the speech recognition engine 181 to recognize the voice.
If a control command is transmitted by the display device 100 as a result of the speech recognition, the third controller 380 may receive the control command through the third communication unit 360 and perform an operation corresponding to the received control command.
Hereinafter, the detailed configuration of the air conditioner 400 will be described.
As shown in Fig. 2, the air conditioner 400, which serves as an external device, may include a fourth communication unit 460 which communicates with the outside, a fourth storage unit 470 which stores data, and a fourth controller 480 which controls the air conditioner 400.
The fourth communication unit 460, which communicates with the display device 100, may perform wireless communication including at least one of RF, Zigbee, and Bluetooth.
The fourth communication unit 460 transmits to the display device 100 the candidate instruction words managed by the fourth application 472 of the fourth storage unit 470 (to be described later).
The fourth storage unit 470, which stores data under the control of the fourth controller 480, may be implemented as a non-volatile storage medium such as a flash memory. The fourth storage unit 470 is accessed by the fourth controller 480, which reads/writes/modifies/deletes/updates the data.
The data stored in the fourth storage unit 470 includes, for example, an OS for driving the air conditioner 400, various applications run on the OS, image data, additional data, and the like.
The fourth storage unit 470 may also store at least one application (development application), for example, a fourth application 472 for performing the functions of the air conditioner 400. The fourth application 472 is driven under the control of the fourth controller 480 (to be described later) and performs various functions of the air conditioner 400.
Although Fig. 2 illustrates the air conditioner 400 in which one application 472 is installed, the exemplary embodiments are not limited thereto. That is, two or more applications may be installed in the air conditioner 400.
The fourth application 472 may manage candidate instruction words corresponding to the functions it performs. The candidate instruction words managed by the fourth application 472 may be registered to/deleted from the instruction word table 171 of the display device 100.
The fourth controller 480 controls the various elements of the air conditioner 400. For example, the fourth controller 480 may receive a control command in response to a user's manipulation of the remote controller of the air conditioner 400, and perform a control operation according to the received control command, for example, adjusting the temperature.
The fourth controller 480 may be implemented as an MCU combined with software.
When communicating with the display device 100, the fourth controller 480 may transmit to the display device 100 the candidate instruction words managed by the fourth application 472 of the fourth storage unit 470. The transmitted candidate instruction words are registered in the instruction word table 171 of the display device 100 and used by the speech recognition engine 181 to recognize the voice.
If a control command is transmitted by the display device 100 as a result of the speech recognition, the fourth controller 480 may receive the control command through the fourth communication unit 460 and perform an operation corresponding to the received control command.
If a voice input is detected by at least one of the plurality of voice acquisition units 140, 240, and 340, the first controller 180 of the display device 100, which serves as the master device of the speech recognition system according to the exemplary embodiment, controls the first communication unit 160 to receive a voice signal from the voice acquisition unit to which the voice input was detected. The first controller 180 receives, through the first communication unit 160, candidate instruction words from at least one of the development applications 372 and 472 of the plurality of external devices 300 and 400, or from the native applications 172 and 173 of the display device 100, in order to recognize the received voice signal, and registers the transmitted candidate instruction words in the instruction word table 171 of the first storage unit 170. The speech recognition engine 181 compares the candidate instruction words registered in the instruction word table 171 with the voice signal and recognizes the voice.
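Tying the pieces together, the compact sketch below walks the flow just described: candidate instruction words from native and development applications are gathered into the table, a detected voice input is matched against them, and the resulting command is dispatched. Every name and word list is invented, and a plain string match stands in for the speech recognition engine 181.

```python
# End-to-end sketch of the flow in this paragraph, with invented names and a
# string-equality stand-in for the speech recognition engine 181.

APPS = {
    "first application (172, native)":       ["volume up", "volume down"],
    "fourth application (472, development)": ["increase temperature"],
}

def register_candidates(table, apps):
    for app, words in apps.items():
        for w in words:
            table[w] = app                      # instruction word table 171
    return table

def recognize(voice_text, table):
    # Stand-in for feature-vector comparison: exact match against the table.
    return voice_text if voice_text in table else None

def handle_voice_input(voice_text):
    table = register_candidates({}, APPS)       # words received over comm unit 160
    word = recognize(voice_text, table)
    if word is None:
        return "no matching candidate instruction word"
    app = table[word]
    if "native" in app:
        return f"execute '{word}' in {app}"
    return f"send control command '{word}' to the device hosting {app}"

if __name__ == "__main__":
    print(handle_voice_input("volume down"))
    print(handle_voice_input("increase temperature"))
```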
The display device 100 can detect voices input through various devices which receive a user's voice. The display device 100 can recognize the voice by using the candidate instruction words provided by the applications, and can dynamically register/delete the candidate instruction words used for speech recognition. Accordingly, the candidate instruction words of the display device 100 are prevented from increasing unnecessarily.
The display device 100 may receive a voice signal from the voice acquisition unit 140, receive candidate instruction words from at least one of the native applications 172 and 173 or from the development applications 372 and 472, and perform speech recognition using the speech recognition engine 181.
Hereinafter, a speech recognition system which recognizes a voice according to an exemplary embodiment will be described in more detail with reference to Figs. 3 and 4.
Fig. 3 illustrates an example of performing speech recognition, and Fig. 4 illustrates an example of a screen displaying the speech recognition result of Fig. 3.
As shown in Fig. 3, the display device 100 may have registered candidate instruction words which are provided by at least one application (including native applications and development applications) and stored in the instruction word table 171.
For example, instruction words A and B are transmitted to the instruction word table 171 by the first application 172, that is, a native application (501), and are stored in the instruction word table 171 (502). The speech recognition engine 181 registers the instruction words A and B stored in the instruction word table 171 as candidate instruction words (504).
Instruction words C and D are transmitted to the instruction word table 171 by the third application 372, that is, a development application (505), and are stored in the instruction word table 171 (506). The speech recognition engine 181 registers the instruction words C and D stored in the instruction word table 171 as candidate instruction words (508).
Accordingly, the speech recognition engine 181 registers the instruction words A, B, C, and D, transmitted by the first application 172 and the third application 372, as candidate instruction words.
For example, while the instruction words A, B, C, and D are registered as candidate instruction words, a voice A input to the second voice acquisition unit 240, which is separate from the display device 100, may be detected. The detected voice A is converted into a voice signal by the second speech converter 250 and transmitted to the speech recognition engine 181 through the second communication unit 260 and the first communication unit 160 (509).
The speech recognition engine 181 compares the voice signal of the voice A with the registered candidate instruction words A, B, C, and D, determines an identical or similar instruction word, and recognizes the voice A (510).
The first controller 180 may transmit the recognition result to the display unit 130 (511), and the display unit 130 may display the speech recognition result as in Fig. 4.
As shown in Fig. 4, the display unit 130 may display a UI showing the speech recognition result "A" 60 and the candidate instruction words A 61, B 62, C 63, and D 64 according to the speech recognition result. The display unit 130 may also display a UI showing the application information (the first application) 65 of the application which manages the instruction word A, according to the speech recognition result.
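A small sketch of the kind of UI payload Fig. 4 implies: the recognized result, the candidate instruction words, and the information of the managing application bundled for the display unit. The field and function names are invented for illustration.

```python
# Sketch: bundle what Fig. 4 shows on screen - the recognition result "A",
# the candidate instruction words, and the managing application's info.
# Field names are illustrative, not taken from the patent.

def build_recognition_ui(result, candidates, managing_app):
    return {
        "recognition_result": result,            # item 60 in Fig. 4
        "candidate_words": list(candidates),     # items 61-64
        "application_info": managing_app,        # item 65
    }

if __name__ == "__main__":
    ui = build_recognition_ui("A", ["A", "B", "C", "D"], "first application")
    for key, value in ui.items():
        print(f"{key}: {value}")
```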
Through the UI displayed on the screen, the user can check the speech recognition result and the candidate instruction words. If the speech recognition result does not match the user's speaking intention, the user may select one of the candidate instruction words. The user may obtain information about the device related to the speech recognition result through the application information.
The first controller 180 transmits a control command to the first application 172 according to the speech recognition result, as in Fig. 3 (512). The first application 172 performs a control operation according to the recognized voice A under the control of the first controller 180. For example, if the voice A is "volume down", the volume of the display device 100 is reduced.
As described with reference to Figs. 3 and 4, the display device 100 may register the instruction words of some applications in advance (for example, A, B, C, and D), and if a user's voice is detected, the display device 100 can recognize the voice, display the speech recognition result, and perform a corresponding control operation based on the registered candidate instruction words.
Although Figs. 3 and 4 illustrate the case where the instruction words of the first application 172 and the third application 372 are registered as candidate instruction words and the user's voice is input through the second voice acquisition unit 240, the exemplary embodiments are not limited thereto. For example, instruction words may be transmitted by various other native applications and development applications to register/delete candidate instruction words, and a voice may be input through various voice acquisition units.
Hereinafter, a speech recognition system that performs speech recognition according to another exemplary embodiment will be described in detail with reference to Fig. 5.
Fig. 5 illustrates an example of performing speech recognition according to another exemplary embodiment.
As shown in Fig. 5, a voice E input to the third voice acquisition unit 340, which is separate from the display device 100, may be detected. The detected voice E is converted into a voice signal by the third speech converter 350, and the voice signal is sent to the speech recognition engine 181 through the third communication unit 360 and the first communication unit 160 (701).
The display device 100 may have registered candidate instruction words. For example, coding lines E and F are sent by the third application 372 to the coding line table 171 (702) and are stored in the coding line table 171 (703). The speech recognition engine 181 registers the coding lines E and F stored in the coding line table 171 as candidate instruction words (705).
That is, the coding lines E and F sent by the third application 372 are registered in the speech recognition engine 181 as candidate instruction words.
With the coding lines E and F registered as candidate instruction words, the speech recognition engine 181 compares the voice signal of the voice E with the registered candidate instruction words E and F, determines an identical or similar coding line, and recognizes the voice E (706).
The first controller 180 may send the recognition result to the display unit 130 (707), and the display unit 130 may display the speech recognition result.
The first controller 180 sends a control command according to the speech recognition result to the third application 372 (708). Under the control of the third controller 380, the third application 372 performs a control operation according to the recognized voice E. If the control command sent according to the speech recognition result is a command for controlling the display device 100, the control command may instead be sent to the first application 172 or the second application 173.
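The routing rule in this embodiment is short: the command normally goes back to the third application 372 on the external device, but if it actually controls the display device 100 it is redirected to a local application. A hedged sketch of that branch is shown below; the set of display-device commands and the application names are invented for illustration.

```python
# Sketch of step 708 plus the redirect rule: commands recognized from the third
# application's coding lines normally return to that application, but commands
# that control the display device itself go to a local application instead.

DISPLAY_DEVICE_COMMANDS = {"volume down", "channel up", "power off"}   # assumed set

def route_control_command(coding_line, local_apps, external_app):
    if coding_line in DISPLAY_DEVICE_COMMANDS:
        # The command controls the display device 100: use the first or second application.
        target = local_apps[0]
    else:
        # Otherwise the third application 372 on the external device handles it.
        target = external_app
    print(f"sending {coding_line!r} to {target}")

local_apps = ["first_application_172", "second_application_173"]
route_control_command("lights off", local_apps, "third_application_372")
route_control_command("volume down", local_apps, "third_application_372")
```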
As shown in Fig. 5, if a user's voice is detected, the display device 100 may register, as candidate instruction words, the coding lines (such as E and F) of the application corresponding to the device into which the voice was input, recognize the voice based on the registered candidate instruction words, display the speech recognition result, and perform the corresponding control operation.
Although Fig. 5 illustrates the case where the user's voice is input through the third voice acquisition unit 340 and the instruction words of the third application 372 are registered as candidate instruction words, the exemplary embodiments are not limited thereto. For example, a voice may be input through various voice acquisition units, and various native applications and developed applications may send coding lines to register/delete candidate instruction words.
Hereinafter, a voice recognition method of the speech recognition system according to an exemplary embodiment will be described with reference to the accompanying drawings.
Fig. 6 is a flowchart showing the voice recognition method of the speech recognition system according to an exemplary embodiment.
As shown in Fig. 6, the speech recognition system may detect a voice input to at least one of the plurality of voice acquisition units 140, 240 and 340 (operation S810). The detected voice is converted into an electrical voice signal by the speech converters 150, 250 and 350.
The first controller 180 receives the voice signal (operation S820). If the detected voice was input to the external voice acquisition unit 240 or 340, the voice signal may be received through the first communication unit 160.
The speech recognition engine 181 registers candidate instruction words for recognizing the voice based on the voice signal (operation S830). The registered candidate instruction words may be words stored in advance in the coding line table 171, or words received from the native applications or developed applications 172, 173, 372 and 472 and stored in the coding line table 171.
The speech recognition engine 181 recognizes the user's voice based on the stored candidate instruction words (operation S840).
The first controller 180 displays the speech recognition result on the display unit 130 (operation S850). The display unit 130 may display the speech recognition result for the voice signal, the candidate instruction words, and the application information according to the speech recognition result.
The first controller 180 generates a control command according to the speech recognition result and sends the control command to the application (operation S860). Accordingly, an operation may be performed by the generated control command.
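Taken together, operations S810-S860 form a short pipeline: detect a voice, convert it to a signal, register candidate words, recognize, display, and dispatch a control command. The compact sketch below reflects that sequence only; every function body is a placeholder for the corresponding component described above, not an implementation of it.

```python
# Compact sketch of the Fig. 6 flow (operations S810-S860). Each step body is
# a placeholder standing in for the components described in the text.

def detect_voice():                       # S810: a voice acquisition unit picks up speech
    return "volume down please"

def to_voice_signal(raw_voice):           # S810/S820: speech converter + communication units
    return raw_voice                      # stands in for the electrical voice signal

def register_candidates():                # S830: coding lines provided by the applications
    return ["volume down", "channel up", "lights off"]

def recognize(signal, candidates):        # S840: find an identical/similar candidate word
    return next((w for w in candidates if w in signal), None)

def display_result(result, candidates):   # S850: UI shown on display unit 130
    print(f"recognized: {result!r}, candidates: {candidates}")

def send_control_command(result):         # S860: control command to the owning application
    print(f"control command issued for {result!r}")

signal = to_voice_signal(detect_voice())
candidates = register_candidates()
result = recognize(signal, candidates)
display_result(result, candidates)
if result is not None:
    send_control_command(result)
```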
Fig. 7 is a flowchart showing the details of the operation of detecting a voice input in Fig. 6.
As shown in Fig. 7, the display device 100, as the main device, may operate in a voice input mode in which a user's voice can be input (operation S811). In the voice input mode, a voice may be input through the various voice acquisition units 140, 240 and 340.
For example, the first controller 180 may detect speech of a wakeup keyword from one of the plurality of voice acquisition units 140, 240 and 340 (operation S812). The wakeup keyword enables a voice to be input through a particular voice acquisition unit, and may be set in advance. For example, the first voice acquisition unit 140 of the display device 100 may set voices related to TV control, such as channel and volume, as wakeup keywords. The second voice acquisition unit 240 of the mobile phone 200 may set voices related to calls, contact information and the like as wakeup keywords.
Alternatively, if a trigger signal is input through one of the plurality of voice acquisition units 140, 240 and 340 as a result of manipulating a predetermined button (a voice input button), the first controller 180 may detect the voice input through that voice acquisition unit (operation S813). For example, if the user manipulates the voice input button provided on a particular voice acquisition device, the voice input to the voice input unit of that voice acquisition device is detected.
According to the detection, the first controller 180 enables, among the plurality of voice acquisition units 140, 240 and 340, the voice acquisition unit into which the voice is input (operation S814). Because only one voice acquisition unit is enabled, detection of unnecessary voices and resulting malfunction can be prevented.
The voice signal is sent from the enabled voice acquisition unit to the speech recognition engine 181 so that speech recognition is performed.
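In Fig. 7 the controller enables exactly one acquisition unit, chosen either by a per-device wakeup keyword (S812) or by a voice-input-button trigger (S813), so the other units cannot feed in unwanted speech. The sketch below illustrates that selection under assumptions; the keyword sets per unit and the unit names are invented examples.

```python
# Sketch of the Fig. 7 selection (operations S811-S814): enable only the voice
# acquisition unit addressed by a wakeup keyword or a button trigger.
# Keyword sets are invented examples of per-device wakeup keywords.

WAKEUP_KEYWORDS = {
    "first_unit_140":  {"channel", "volume"},   # TV-related control words
    "second_unit_240": {"call", "contact"},     # phone-related control words
    "third_unit_340":  {"lights", "heating"},
}

def select_unit(speech=None, button_trigger_from=None):
    """Return the single acquisition unit to enable, or None."""
    if button_trigger_from is not None:          # S813: voice input button pressed
        return button_trigger_from
    if speech is not None:                       # S812: wakeup keyword spoken
        for unit, keywords in WAKEUP_KEYWORDS.items():
            if any(k in speech.lower() for k in keywords):
                return unit
    return None

def enable(unit):                                # S814: only this unit stays active
    print(f"enabling {unit}; all other acquisition units stay disabled")

unit = select_unit(speech="volume down")
if unit:
    enable(unit)                                 # -> first_unit_140
unit = select_unit(button_trigger_from="second_unit_240")
if unit:
    enable(unit)                                 # -> second_unit_240
```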
Fig. 8 is a flowchart showing the details of the operation of performing speech recognition in Fig. 6.
As shown in Fig. 8, the speech recognition engine 181 may receive candidate instruction words from at least one of the plurality of applications 172, 173, 372 and 472, and register the candidate instruction words (operation S830).
The speech recognition engine 181 may determine whether a registered candidate instruction word is identical or similar to the received voice signal (operation S841).
If it is determined that there is an identical or similar candidate instruction word, the speech recognition engine 181 determines the identical/similar coding line and performs speech recognition, and the first controller 180 displays the speech recognition result on the display unit 130 (operation S850).
If it is determined that there is no identical or similar candidate instruction word, the speech recognition engine 181 may determine whether to receive and register the candidate instruction words of other applications (operation S842). The first controller 180 may receive and register the candidate instruction words of other applications according to a user's selection or input, and may receive and register the candidate instruction words of the plurality of applications in a preset order. Considering the capacity of the first storage unit 170 of the display device 100, earlier-registered candidate instruction words may be selectively deleted.
That is, if no registered candidate instruction word is identical or similar to the voice signal, operations S842 and S841 may be performed in sequence to perform speech recognition.
If it is decided in operation S842 not to receive and register the candidate instruction words of other applications, the speech recognition engine 181 stops the speech recognition, and the first controller 180 may display a speech recognition failure on the display unit 130.
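The Fig. 8 loop keeps retrying: when no registered candidate matches, the engine pulls in the candidate words of further applications in a preset order, optionally evicting the oldest registrations when storage runs low, and reports failure only when every application has been exhausted. A hedged sketch of that loop follows; the application list, the capacity limit and the substring match are illustrative assumptions, not the patent's actual rules.

```python
# Sketch of the Fig. 8 loop (operations S830, S841, S842): retry recognition
# after registering further applications' candidate words in a preset order,
# evicting the earliest registrations when a capacity limit is reached.
from collections import OrderedDict

APPLICATIONS = [                       # assumed preset order of applications
    ("first_application_172",  ["volume down", "channel up"]),
    ("second_application_173", ["open browser"]),
    ("third_application_372",  ["lights off", "heating on"]),
]
CAPACITY = 4                           # stands in for the first storage unit's limit

def recognize_with_fallback(utterance):
    registered = OrderedDict()         # coding line -> application, oldest first
    for app_name, words in APPLICATIONS:
        for word in words:             # S842: register this application's words
            if len(registered) >= CAPACITY:
                registered.popitem(last=False)   # delete the earliest-registered word
            registered[word] = app_name
        for word in registered:        # S841: identical/similar candidate word?
            if word in utterance:
                return word, registered[word]    # would be displayed (S850)
    return None, None                  # speech recognition failed

print(recognize_with_fallback("please turn the lights off"))
print(recognize_with_fallback("make me a coffee"))
```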
Because the main device according to the exemplary embodiments detects voices input through the various devices that receive the user's voice, various voice acquisition devices can be used, and linked services can be provided through those voice acquisition devices.
The candidate instruction words for speech recognition are sent by the plurality of applications and are registered/deleted. Therefore, the candidate instruction words of the main device are not increased unnecessarily, a decrease in processing speed or recognition rate can be prevented, and the overall efficiency of the speech recognition system can be improved.
The user can more easily identify the speech recognition result, the candidate instruction words, the application information, and information about the various voice acquisition devices and the devices that provide the candidate instruction words for speech recognition, so user convenience is improved.
Without being limited thereto, the exemplary embodiments may also be written as computer programs and implemented in general-purpose digital computers that execute the programs using a computer-readable recording medium. Examples of the computer-readable recording medium include magnetic storage media (for example, ROM, floppy disks, hard disks, etc.) and optical recording media (for example, CD-ROMs or DVDs). Meanwhile, the exemplary embodiments may be written as computer programs transmitted over a computer-readable transmission medium, such as a carrier wave, and received and implemented in general-purpose digital computers that execute the programs. Further, although not required in all aspects, one or more units of the apparatuses may include a processor or microprocessor that executes a computer program stored in a computer-readable medium, such as a local storage.
Although a few exemplary embodiments have been shown and described, it will be apparent to those skilled in the art that changes may be made to these exemplary embodiments without departing from the principles and spirit of the inventive concept, the scope of which is defined in the claims and their equivalents.

Claims (14)

1. A display device, comprising:
a display unit configured to display an image;
a storage unit configured to store data;
a communicator configured to communicate with an external device; and
a processor configured to:
based on the external device being connected to the display device via the communicator, obtain a coding line corresponding to a function of at least one application included in the external device;
control the storage unit to store the obtained coding line;
register the coding line stored in the storage unit as a candidate instruction word in a speech recognition engine;
based on a voice signal being received from a voice acquisition unit, perform voice recognition processing using the speech recognition engine to identify, among the registered coding lines, a coding line corresponding to the voice signal;
identify, among the at least one application, an application corresponding to the voice signal and the identified coding line; and
control the external device based on the identified coding line corresponding to the voice signal and the identified application.
2. The display device according to claim 1, wherein the at least one application included in the external device is stored in the external device.
3. The display device according to claim 1, wherein the at least one application is configured to manage the coding line.
4. The display device according to claim 1, wherein the coding line corresponding to the function of the at least one application is obtained from the external device corresponding to the at least one application.
5. The display device according to claim 1, wherein the processor is further configured to, based on a voice signal being received from another voice acquisition unit corresponding to another external device,
obtain a coding line corresponding to a function of another application from the another external device.
6. The display device according to claim 1, wherein the processor is further configured to, based on a failure of the voice recognition processing, select, based on a user input, one candidate instruction word among a plurality of candidate instruction words that is similar to a speech recognition result of the voice recognition processing.
7. The display device according to claim 1, wherein the processor is further configured to control the storage unit to delete a stored coding line based on a memory state of the storage unit.
8. A voice recognition method of a display device, the method comprising:
based on an external device being connected to the display device,
obtaining a coding line corresponding to a function of at least one application included in the external device;
storing the obtained coding line;
registering the stored coding line as a candidate instruction word;
based on a voice signal being received from a voice acquisition unit, performing voice recognition processing to identify, among the registered coding lines, a coding line corresponding to the voice signal;
identifying, among the at least one application, an application corresponding to the voice signal and the identified coding line; and
controlling the external device based on the identified coding line corresponding to the voice signal and the identified application.
9. The method according to claim 8, wherein the at least one application included in the external device is stored in the external device.
10. The method according to claim 8, wherein the at least one application is configured to manage the coding line.
11. The method according to claim 8, wherein the coding line corresponding to the function of the at least one application is obtained from the external device corresponding to the at least one application.
12. The method according to claim 8, further comprising: based on a voice signal being received from another voice acquisition unit corresponding to another external device, obtaining a coding line corresponding to a function of another application from the another external device.
13. The method according to claim 8, further comprising: based on a failure of the voice recognition processing, selecting, based on a user input, one candidate instruction word among a plurality of candidate instruction words that is similar to a speech recognition result of the voice recognition processing.
14. The method according to claim 8, further comprising: deleting a stored coding line based on a memory state of a storage device that stores the coding line.
CN201310553280.4A 2012-11-09 2013-11-08 Display device, voice acquisition device and its audio recognition method Active CN103811007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510276510.6A CN104883587A (en) 2012-11-09 2013-11-08 Display Apparatus, Voice Acquiring Apparatus And Voice Recognition Method Thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2012-0126650 2012-11-09
KR1020120126650A KR20140060040A (en) 2012-11-09 2012-11-09 Display apparatus, voice acquiring apparatus and voice recognition method thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201510276510.6A Division CN104883587A (en) 2012-11-09 2013-11-08 Display Apparatus, Voice Acquiring Apparatus And Voice Recognition Method Thereof

Publications (2)

Publication Number Publication Date
CN103811007A CN103811007A (en) 2014-05-21
CN103811007B true CN103811007B (en) 2019-12-03

Family

ID=49554021

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201310553280.4A Active CN103811007B (en) 2012-11-09 2013-11-08 Display device, voice acquisition device and its audio recognition method
CN201510276510.6A Pending CN104883587A (en) 2012-11-09 2013-11-08 Display Apparatus, Voice Acquiring Apparatus And Voice Recognition Method Thereof

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201510276510.6A Pending CN104883587A (en) 2012-11-09 2013-11-08 Display Apparatus, Voice Acquiring Apparatus And Voice Recognition Method Thereof

Country Status (7)

Country Link
US (5) US10043537B2 (en)
EP (4) EP3352471B1 (en)
JP (2) JP5868927B2 (en)
KR (1) KR20140060040A (en)
CN (2) CN103811007B (en)
RU (1) RU2677396C2 (en)
WO (1) WO2014073823A1 (en)

Families Citing this family (94)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140060040A (en) 2012-11-09 2014-05-19 삼성전자주식회사 Display apparatus, voice acquiring apparatus and voice recognition method thereof
US20180317019A1 (en) 2013-05-23 2018-11-01 Knowles Electronics, Llc Acoustic activity detecting microphone
CN105023575B (en) * 2014-04-30 2019-09-17 中兴通讯股份有限公司 Audio recognition method, device and system
KR102147329B1 (en) 2014-06-17 2020-08-24 엘지전자 주식회사 Video display device and operating method thereof
KR102147346B1 (en) * 2014-06-23 2020-08-24 엘지전자 주식회사 Display device and operating method thereof
DE112016000287T5 (en) * 2015-01-07 2017-10-05 Knowles Electronics, Llc Use of digital microphones for low power keyword detection and noise reduction
EP3067884B1 (en) * 2015-03-13 2019-05-08 Samsung Electronics Co., Ltd. Speech recognition system and speech recognition method thereof
AU2015390534B2 (en) 2015-04-10 2019-08-22 Honor Device Co., Ltd. Speech recognition method, speech wakeup apparatus, speech recognition apparatus, and terminal
CN105185379B (en) * 2015-06-17 2017-08-18 百度在线网络技术(北京)有限公司 voiceprint authentication method and device
CN105206275A (en) * 2015-08-31 2015-12-30 小米科技有限责任公司 Device control method, apparatus and terminal
CN105222275B (en) * 2015-09-25 2018-04-13 珠海格力电器股份有限公司 Switching method, the apparatus and system of display data
CN106683668A (en) * 2015-11-05 2017-05-17 芋头科技(杭州)有限公司 Method of awakening control of intelligent device and system
KR20170082349A (en) * 2016-01-06 2017-07-14 삼성전자주식회사 Display apparatus and control methods thereof
US10095470B2 (en) 2016-02-22 2018-10-09 Sonos, Inc. Audio response playback
US9947316B2 (en) 2016-02-22 2018-04-17 Sonos, Inc. Voice control of a media playback system
US10509626B2 (en) 2016-02-22 2019-12-17 Sonos, Inc Handling of loss of pairing between networked devices
US9772817B2 (en) 2016-02-22 2017-09-26 Sonos, Inc. Room-corrected voice detection
US10264030B2 (en) 2016-02-22 2019-04-16 Sonos, Inc. Networked microphone device control
US9965247B2 (en) 2016-02-22 2018-05-08 Sonos, Inc. Voice controlled media playback system based on user profile
CN105791934A (en) * 2016-03-25 2016-07-20 福建新大陆通信科技股份有限公司 Realization method and system of intelligent STB (Set Top Box) microphone
US9978390B2 (en) 2016-06-09 2018-05-22 Sonos, Inc. Dynamic player selection for audio signal processing
US10134399B2 (en) 2016-07-15 2018-11-20 Sonos, Inc. Contextualization of voice inputs
US10152969B2 (en) 2016-07-15 2018-12-11 Sonos, Inc. Voice detection by multiple devices
US10115400B2 (en) 2016-08-05 2018-10-30 Sonos, Inc. Multiple voice services
US9942678B1 (en) 2016-09-27 2018-04-10 Sonos, Inc. Audio playback settings for voice interaction
US9743204B1 (en) 2016-09-30 2017-08-22 Sonos, Inc. Multi-orientation playback device microphones
KR102562287B1 (en) 2016-10-14 2023-08-02 삼성전자주식회사 Electronic device and audio signal processing method thereof
US10181323B2 (en) 2016-10-19 2019-01-15 Sonos, Inc. Arbitration-based voice recognition
KR102598082B1 (en) * 2016-10-28 2023-11-03 삼성전자주식회사 Image display apparatus, mobile device and operating method for the same
US10593328B1 (en) * 2016-12-27 2020-03-17 Amazon Technologies, Inc. Voice control of remote device
US11183181B2 (en) 2017-03-27 2021-11-23 Sonos, Inc. Systems and methods of multiple voice services
EP4036709B1 (en) * 2017-07-14 2024-04-24 Daikin Industries, Ltd. Environment control system
US10475449B2 (en) 2017-08-07 2019-11-12 Sonos, Inc. Wake-word detection suppression
US10048930B1 (en) 2017-09-08 2018-08-14 Sonos, Inc. Dynamic computation of system response volume
US10446165B2 (en) 2017-09-27 2019-10-15 Sonos, Inc. Robust short-time fourier transform acoustic echo cancellation during audio playback
US10621981B2 (en) 2017-09-28 2020-04-14 Sonos, Inc. Tone interference cancellation
US10482868B2 (en) 2017-09-28 2019-11-19 Sonos, Inc. Multi-channel acoustic echo cancellation
CN109584864B (en) * 2017-09-29 2023-11-24 上海寒武纪信息科技有限公司 Image processing apparatus and method
CN109584862B (en) * 2017-09-29 2024-01-12 上海寒武纪信息科技有限公司 Image processing apparatus and method
US10466962B2 (en) 2017-09-29 2019-11-05 Sonos, Inc. Media playback system with voice assistance
US11445235B2 (en) 2017-10-24 2022-09-13 Comcast Cable Communications, Llc Determining context to initiate interactivity
US20190130898A1 (en) * 2017-11-02 2019-05-02 GM Global Technology Operations LLC Wake-up-word detection
US10880650B2 (en) 2017-12-10 2020-12-29 Sonos, Inc. Network microphone devices with automatic do not disturb actuation capabilities
US10818290B2 (en) 2017-12-11 2020-10-27 Sonos, Inc. Home graph
US11343614B2 (en) 2018-01-31 2022-05-24 Sonos, Inc. Device designation of playback and network microphone device arrangements
JP7133969B2 (en) * 2018-04-27 2022-09-09 シャープ株式会社 Voice input device and remote dialogue system
US11175880B2 (en) 2018-05-10 2021-11-16 Sonos, Inc. Systems and methods for voice-assisted media content selection
US10847178B2 (en) 2018-05-18 2020-11-24 Sonos, Inc. Linear filtering for noise-suppressed speech detection
US10959029B2 (en) 2018-05-25 2021-03-23 Sonos, Inc. Determining and adapting to changes in microphone performance of playback devices
US10681460B2 (en) 2018-06-28 2020-06-09 Sonos, Inc. Systems and methods for associating playback devices with voice assistant services
US11076035B2 (en) 2018-08-28 2021-07-27 Sonos, Inc. Do not disturb feature for audio notifications
US10461710B1 (en) 2018-08-28 2019-10-29 Sonos, Inc. Media playback system with maximum volume setting
US10878811B2 (en) 2018-09-14 2020-12-29 Sonos, Inc. Networked devices, systems, and methods for intelligently deactivating wake-word engines
US10587430B1 (en) 2018-09-14 2020-03-10 Sonos, Inc. Networked devices, systems, and methods for associating playback devices based on sound codes
US11024331B2 (en) 2018-09-21 2021-06-01 Sonos, Inc. Voice detection optimization using sound metadata
US10811015B2 (en) 2018-09-25 2020-10-20 Sonos, Inc. Voice detection optimization based on selected voice assistant service
US11100923B2 (en) 2018-09-28 2021-08-24 Sonos, Inc. Systems and methods for selective wake word detection using neural network models
US10692518B2 (en) 2018-09-29 2020-06-23 Sonos, Inc. Linear filtering for noise-suppressed speech detection via multiple network microphone devices
EP3869301A4 (en) * 2018-10-15 2021-12-15 Sony Group Corporation Information processing device, information processing method, and computer program
US11899519B2 (en) 2018-10-23 2024-02-13 Sonos, Inc. Multiple stage network microphone device with reduced power consumption and processing load
EP3654249A1 (en) 2018-11-15 2020-05-20 Snips Dilated convolutions and gating for efficient keyword spotting
US11183183B2 (en) 2018-12-07 2021-11-23 Sonos, Inc. Systems and methods of operating media playback systems having multiple voice assistant services
US11132989B2 (en) 2018-12-13 2021-09-28 Sonos, Inc. Networked microphone devices, systems, and methods of localized arbitration
US10602268B1 (en) 2018-12-20 2020-03-24 Sonos, Inc. Optimization of network microphone devices using noise classification
US11315556B2 (en) 2019-02-08 2022-04-26 Sonos, Inc. Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification
US10867604B2 (en) 2019-02-08 2020-12-15 Sonos, Inc. Devices, systems, and methods for distributed voice processing
JP2020129252A (en) * 2019-02-08 2020-08-27 三菱電機株式会社 Device control system and terminal apparatus
JP7449070B2 (en) 2019-03-27 2024-03-13 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Voice input device, voice input method and its program
CN110046045B (en) * 2019-04-03 2021-07-30 百度在线网络技术(北京)有限公司 Voice wake-up data packet processing method and device
US11120794B2 (en) 2019-05-03 2021-09-14 Sonos, Inc. Voice assistant persistence across multiple network microphone devices
US11200894B2 (en) 2019-06-12 2021-12-14 Sonos, Inc. Network microphone device with command keyword eventing
US10586540B1 (en) 2019-06-12 2020-03-10 Sonos, Inc. Network microphone device with command keyword conditioning
US11361756B2 (en) 2019-06-12 2022-06-14 Sonos, Inc. Conditional wake word eventing based on environment
US10871943B1 (en) 2019-07-31 2020-12-22 Sonos, Inc. Noise classification for event detection
US11138975B2 (en) 2019-07-31 2021-10-05 Sonos, Inc. Locally distributed keyword detection
US11138969B2 (en) 2019-07-31 2021-10-05 Sonos, Inc. Locally distributed keyword detection
US11317162B2 (en) 2019-09-26 2022-04-26 Dish Network L.L.C. Method and system for navigating at a client device selected features on a non-dynamic image page from an elastic voice cloud server in communication with a third-party search service
CN112581946A (en) * 2019-09-29 2021-03-30 百度在线网络技术(北京)有限公司 Voice control method and device, electronic equipment and readable storage medium
US11189286B2 (en) 2019-10-22 2021-11-30 Sonos, Inc. VAS toggle based on device orientation
JP2021071797A (en) * 2019-10-29 2021-05-06 富士通クライアントコンピューティング株式会社 Display device and information processing device
EP4080348A4 (en) * 2019-12-19 2023-09-13 LG Electronics Inc. Display device
US11200900B2 (en) 2019-12-20 2021-12-14 Sonos, Inc. Offline voice control
US11562740B2 (en) 2020-01-07 2023-01-24 Sonos, Inc. Voice verification for media playback
US11556307B2 (en) 2020-01-31 2023-01-17 Sonos, Inc. Local voice data processing
US11308958B2 (en) 2020-02-07 2022-04-19 Sonos, Inc. Localized wakeword verification
CN113360125A (en) * 2020-03-05 2021-09-07 西安诺瓦星云科技股份有限公司 Image display method, device and system
US11727919B2 (en) 2020-05-20 2023-08-15 Sonos, Inc. Memory allocation for keyword spotting engines
US11308962B2 (en) 2020-05-20 2022-04-19 Sonos, Inc. Input detection windowing
US11482224B2 (en) 2020-05-20 2022-10-25 Sonos, Inc. Command keywords with input detection windowing
CN111934957B (en) * 2020-07-16 2021-06-25 宁波方太厨具有限公司 Application system and method supporting WiFi and offline voice
US11698771B2 (en) 2020-08-25 2023-07-11 Sonos, Inc. Vocal guidance engines for playback devices
US11984123B2 (en) 2020-11-12 2024-05-14 Sonos, Inc. Network device interaction by range
US11551700B2 (en) 2021-01-25 2023-01-10 Sonos, Inc. Systems and methods for power-efficient keyword detection
WO2022265623A1 (en) * 2021-06-15 2022-12-22 Hewlett-Packard Development Company, L.P. Acknowledgement based audio communications

Family Cites Families (79)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5752232A (en) * 1994-11-14 1998-05-12 Lucent Technologies Inc. Voice activated device and method for providing access to remotely retrieved data
JP3697748B2 (en) 1995-08-21 2005-09-21 セイコーエプソン株式会社 Terminal, voice recognition device
US6665639B2 (en) 1996-12-06 2003-12-16 Sensory, Inc. Speech recognition in consumer electronic products
US6188985B1 (en) * 1997-01-06 2001-02-13 Texas Instruments Incorporated Wireless voice-activated device for control of a processor-based host system
KR100237385B1 (en) 1997-08-05 2000-01-15 정선종 The Implementation Method of Speech Recognizer on the Web Browser
DE69712485T2 (en) * 1997-10-23 2002-12-12 Sony Int Europe Gmbh Voice interface for a home network
CN1223508A (en) 1998-01-13 1999-07-21 黄金富 Sound-controlled infrared ray remote controller
JP2000099306A (en) * 1998-05-28 2000-04-07 Canon Inc Device and method for control and storage medium storing program for executing operation processing therefor
JP2000200395A (en) 1999-01-07 2000-07-18 Matsushita Electric Ind Co Ltd Fire prevention system
US6556970B1 (en) * 1999-01-28 2003-04-29 Denso Corporation Apparatus for determining appropriate series of words carrying information to be recognized
DE69942663D1 (en) * 1999-04-13 2010-09-23 Sony Deutschland Gmbh Merging of speech interfaces for the simultaneous use of devices and applications
JP4314680B2 (en) 1999-07-27 2009-08-19 ソニー株式会社 Speech recognition control system and speech recognition control method
US6397186B1 (en) * 1999-12-22 2002-05-28 Ambush Interactive, Inc. Hands-free, voice-operated remote control transmitter
US8271287B1 (en) 2000-01-14 2012-09-18 Alcatel Lucent Voice command remote control system
JP2001296881A (en) 2000-04-14 2001-10-26 Sony Corp Device and method for information processing and recording medium
JP2002116787A (en) 2000-07-10 2002-04-19 Matsushita Electric Ind Co Ltd Priority deciding device, priority deciding method and priority deciding program
ATE342567T1 (en) * 2000-07-28 2006-11-15 Koninkl Philips Electronics Nv SYSTEM FOR CONTROLLING A DEVICE USING VOICE COMMANDS
DE10040466C2 (en) * 2000-08-18 2003-04-10 Bosch Gmbh Robert Method for controlling voice input and output
WO2002017625A1 (en) 2000-08-21 2002-02-28 Koninklijke Philips Electronics N.V. A voice controlled remote control with downloadable set of voice commands
EP1184841A1 (en) * 2000-08-31 2002-03-06 Siemens Aktiengesellschaft Speech controlled apparatus and method for speech input and speech recognition
US20020110246A1 (en) 2001-02-14 2002-08-15 Jason Gosior Wireless audio system
US7369997B2 (en) * 2001-08-01 2008-05-06 Microsoft Corporation Controlling speech recognition functionality in a computing device
US20030061033A1 (en) * 2001-09-26 2003-03-27 Dishert Lee R. Remote control system for translating an utterance to a control parameter for use by an electronic device
CN1437377A (en) * 2002-02-04 2003-08-20 义隆电子股份有限公司 Multifunctional Information service system and its electronic equipment
US20030177012A1 (en) * 2002-03-13 2003-09-18 Brett Drennan Voice activated thermostat
KR100434545B1 (en) * 2002-03-15 2004-06-05 삼성전자주식회사 Method and apparatus for controlling devices connected with home network
JP3715584B2 (en) * 2002-03-28 2005-11-09 富士通株式会社 Device control apparatus and device control method
JP2003295893A (en) * 2002-04-01 2003-10-15 Omron Corp System, device, method, and program for speech recognition, and computer-readable recording medium where the speech recognizing program is recorded
US20040001040A1 (en) * 2002-06-28 2004-01-01 Kardach James P. Methods and apparatus for providing light to a display
US6834265B2 (en) 2002-12-13 2004-12-21 Motorola, Inc. Method and apparatus for selective speech recognition
JP2004275360A (en) * 2003-03-14 2004-10-07 Olympus Corp Endoscope system
CN2620913Y (en) * 2003-04-21 2004-06-16 叶龙 Voice controlled remote controller for domestic electric equipment
JP2005072764A (en) 2003-08-21 2005-03-17 Hitachi Ltd Equipment control system and device thereof, and equipment control method
JP2006033795A (en) 2004-06-15 2006-02-02 Sanyo Electric Co Ltd Remote control system, controller, program for imparting function of controller to computer, storage medium with the program stored thereon, and server
CN1713271A (en) * 2004-06-15 2005-12-28 三洋电机株式会社 Remote control system, controller, program product
CN1725902A (en) * 2004-07-20 2006-01-25 李廷玉 Mobile phone with remote-control function
US7085673B2 (en) 2004-08-31 2006-08-01 Hewlett-Packard Development Company, L.P. Displacement estimation system and method
US20060074658A1 (en) * 2004-10-01 2006-04-06 Siemens Information And Communication Mobile, Llc Systems and methods for hands-free voice-activated devices
KR100682897B1 (en) 2004-11-09 2007-02-15 삼성전자주식회사 Method and apparatus for updating dictionary
US8942985B2 (en) * 2004-11-16 2015-01-27 Microsoft Corporation Centralized method and system for clarifying voice commands
US20070299670A1 (en) * 2006-06-27 2007-12-27 Sbc Knowledge Ventures, Lp Biometric and speech recognition system and method
US20080037727A1 (en) * 2006-07-13 2008-02-14 Clas Sivertsen Audio appliance with speech recognition, voice command control, and speech generation
US7899673B2 (en) * 2006-08-09 2011-03-01 Microsoft Corporation Automatic pruning of grammars in a multi-application speech recognition interface
KR20080036697A (en) * 2006-10-24 2008-04-29 삼성전자주식회사 Method and apparatus for remote control in portable terminal
CN100538762C (en) 2006-12-15 2009-09-09 广东协联科贸发展有限公司 A kind of keying speech integrated remote controller
US8260618B2 (en) * 2006-12-21 2012-09-04 Nuance Communications, Inc. Method and apparatus for remote control of devices through a wireless headset using voice activation
US8140325B2 (en) * 2007-01-04 2012-03-20 International Business Machines Corporation Systems and methods for intelligent control of microphones for speech recognition applications
KR20080096239A (en) * 2007-04-27 2008-10-30 정장오 Speech recognition kitchen tv system for speech schedule control kitchen tv, home network system, household appliances
US20090018830A1 (en) * 2007-07-11 2009-01-15 Vandinburg Gmbh Speech control of computing devices
US8825468B2 (en) * 2007-07-31 2014-09-02 Kopin Corporation Mobile wireless display providing speech to speech translation and avatar simulating human attributes
US8099289B2 (en) * 2008-02-13 2012-01-17 Sensory, Inc. Voice interface and search for electronic devices including bluetooth headsets and remote systems
TWI385932B (en) * 2008-03-26 2013-02-11 Asustek Comp Inc Device and system for remote controlling
US9135809B2 (en) * 2008-06-20 2015-09-15 At&T Intellectual Property I, Lp Voice enabled remote control for a set-top box
JP5200712B2 (en) * 2008-07-10 2013-06-05 富士通株式会社 Speech recognition apparatus, speech recognition method, and computer program
US8676904B2 (en) * 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
JP5465926B2 (en) * 2009-05-22 2014-04-09 アルパイン株式会社 Speech recognition dictionary creation device and speech recognition dictionary creation method
US20100332236A1 (en) * 2009-06-25 2010-12-30 Blueant Wireless Pty Limited Voice-triggered operation of electronic devices
WO2011027964A1 (en) 2009-09-01 2011-03-10 Lg Electronics Inc. Method for controlling external device and remote controller thereof
KR20110028103A (en) 2009-09-11 2011-03-17 주식회사 카서 Multipurpose local wireless communication apparatus and system
US20110067059A1 (en) * 2009-09-15 2011-03-17 At&T Intellectual Property I, L.P. Media control
US8510103B2 (en) * 2009-10-15 2013-08-13 Paul Angott System and method for voice recognition
KR20110052863A (en) * 2009-11-13 2011-05-19 삼성전자주식회사 Mobile device and method for generating control signal thereof
US9865263B2 (en) 2009-12-01 2018-01-09 Nuance Communications, Inc. Real-time voice recognition on a handheld device
CN101778198A (en) * 2010-01-25 2010-07-14 上海享云信息系统有限公司 Enhanced-type TV terminal system
US20110246902A1 (en) 2010-04-01 2011-10-06 Tsung-Han Tsai Method for portable phones to control computers
CN102255780A (en) 2010-05-20 2011-11-23 株式会社曙飞电子 Home network system and control method
US20120078635A1 (en) * 2010-09-24 2012-03-29 Apple Inc. Voice control system
US8914287B2 (en) * 2010-12-31 2014-12-16 Echostar Technologies L.L.C. Remote control audio link
EP2518722A3 (en) * 2011-04-28 2013-08-28 Samsung Electronics Co., Ltd. Method for providing link list and display apparatus applying the same
US20130027613A1 (en) 2011-05-03 2013-01-31 Lg Electronics Inc. Image display apparatus, portable terminal, and methods for operating the same
CN102196207B (en) 2011-05-12 2014-06-18 深圳市车音网科技有限公司 Method, device and system for controlling television by using voice
US20130073293A1 (en) * 2011-09-20 2013-03-21 Lg Electronics Inc. Electronic device and method for controlling the same
US20130144618A1 (en) 2011-12-02 2013-06-06 Liang-Che Sun Methods and electronic devices for speech recognition
US20130238326A1 (en) * 2012-03-08 2013-09-12 Lg Electronics Inc. Apparatus and method for multiple device voice control
KR101946364B1 (en) * 2012-05-01 2019-02-11 엘지전자 주식회사 Mobile device for having at least one microphone sensor and method for controlling the same
RU121608U1 (en) * 2012-06-05 2012-10-27 Закрытое акционерное общество "Титан-информационный сервис" ELECTRONIC CONTROL DEVICE
US20140074472A1 (en) * 2012-09-12 2014-03-13 Chih-Hung Lin Voice control system with portable voice control device
US9646610B2 (en) * 2012-10-30 2017-05-09 Motorola Solutions, Inc. Method and apparatus for activating a particular wireless communication device to accept speech and/or voice commands using identification data consisting of speech, voice, image recognition
KR20140060040A (en) 2012-11-09 2014-05-19 삼성전자주식회사 Display apparatus, voice acquiring apparatus and voice recognition method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1342308A (en) * 2000-01-05 2002-03-27 松下电器产业株式会社 Device setter, device setting system, and recorded medium where device setting program is recoded
CN202033897U (en) * 2011-01-25 2011-11-09 张国鸿 Voice remote control receiving module and electric appliance with voice remote control function

Also Published As

Publication number Publication date
EP3790285A1 (en) 2021-03-10
US20200184989A1 (en) 2020-06-11
US20220358949A1 (en) 2022-11-10
CN103811007A (en) 2014-05-21
RU2015121906A (en) 2016-12-27
JP5868927B2 (en) 2016-02-24
US20230121055A1 (en) 2023-04-20
JP6640502B2 (en) 2020-02-05
EP4106339A1 (en) 2022-12-21
US10043537B2 (en) 2018-08-07
JP2014096153A (en) 2014-05-22
CN104883587A (en) 2015-09-02
WO2014073823A1 (en) 2014-05-15
RU2677396C2 (en) 2019-01-16
US20170337937A1 (en) 2017-11-23
EP2731349A1 (en) 2014-05-14
EP2731349B1 (en) 2018-04-18
JP2016027484A (en) 2016-02-18
US11727951B2 (en) 2023-08-15
KR20140060040A (en) 2014-05-19
US20140136205A1 (en) 2014-05-15
EP3352471B1 (en) 2020-12-30
US10586554B2 (en) 2020-03-10
EP3352471A1 (en) 2018-07-25

Similar Documents

Publication Publication Date Title
CN103811007B (en) Display device, voice acquisition device and its audio recognition method
US20170223301A1 (en) Image processing apparatus, voice acquiring apparatus, voice recognition method thereof and voice recognition system
CN113784220B (en) Method for playing media resources, display device and mobile device
US20110068899A1 (en) Method and System for Controlling Electronic Devices
CN112055240B (en) Display device and operation prompt display method for pairing display device with remote controller
US10439838B2 (en) Control device, method of controlling the same, and integrated control system
CN113784200B (en) Communication terminal, display device and screen projection connection method
CN112543359B (en) Display device and method for automatically configuring video parameters
KR102501655B1 (en) Display apparatus, voice acquiring apparatus and voice recognition method thereof
CN111949782A (en) Information recommendation method and service equipment
CN112473121B (en) Display device and avoidance ball display method based on limb identification
CN113784186B (en) Terminal device, server, and communication control method
KR102262050B1 (en) Display apparatus, voice acquiring apparatus and voice recognition method thereof
CN114007128A (en) Display device and network distribution method
CN112040299A (en) Display device, server and live broadcast display method
CN112259096B (en) Voice data processing method and device
CN112492402B (en) Display device
CN111914114A (en) Badcase mining method and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant