WO2015030474A1 - Electronic device and method for voice recognition - Google Patents
- Publication number
- WO2015030474A1 (PCT/KR2014/007951)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice
- voice recognition
- command
- processor
- electronic device
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3231—Monitoring the presence, absence or movement of users
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present disclosure relates to an electronic device, and various embodiments relate to a configuration for speech recognition.
- each person has his or her own unique voice, and this voice itself can be used as a means for authentication.
- the speech recognition apparatus may recognize a specific person's voice by using a voice recognition model in which the specific person's voice and information on the voice are collected, and this is called a speaker verification method.
- a speech recognition apparatus may distinguish the speech of a speaker among multiple people by using a pre-trained speech recognition model, which is called speaker identification.
- Speech recognition devices using the speaker verification method or the speaker identification method can train a speech recognition model on a specific phrase; in this case, the device recognizes a voice only when a specific speaker speaks that specific phrase, which increases security performance.
- the speech recognition apparatus may recognize a speech by using an isolated word recognition method that recognizes only a predetermined specific word.
- the isolated word recognition method refers to a method of generating a template for each specific word and comparing it with the input speech. Since a speech recognition apparatus using the isolated word recognition method recognizes only predetermined words, its speech recognition rate is relatively high, and its failure rate due to ambient noise is relatively low. Accordingly, the isolated word recognition method requires a far smaller amount of calculation than large vocabulary speech recognition (LVSR) or natural language speech recognition, in which any speech can be converted into text, and so can easily be used in a portable terminal device.
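The template-comparison idea above can be sketched with a toy dynamic time warping (DTW) matcher. This is an illustrative sketch, not the patent's implementation: the function names, the one-dimensional "feature" sequences, and the threshold are all assumptions made for the example.

```python
# Hypothetical sketch of isolated-word recognition by template matching:
# each predetermined word has a stored feature template, and the input
# utterance is compared against every template by DTW distance.

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def recognize_isolated_word(features, templates, threshold=5.0):
    """Return the template word closest to the input, or None if no match."""
    best_word, best_dist = None, float("inf")
    for word, tmpl in templates.items():
        dist = dtw_distance(features, tmpl)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word if best_dist <= threshold else None

templates = {"play": [1.0, 2.0, 3.0], "stop": [3.0, 1.0, 0.5]}
print(recognize_isolated_word([1.1, 2.1, 2.9], templates))  # close to "play"
```

Because every input is matched only against a small fixed template set, the computation stays small, which is the property the passage above attributes to isolated word recognition.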
- the conventional speech recognition apparatus recognizes a speech using a speaker verification method or a speaker identification method or recognizes a speech using an isolated word recognition method.
- the conventional speech recognition apparatus has a problem in that, when a low power processor is mounted, it is difficult to perform a speech recognition method that requires a large amount of calculation.
- the conventional speech recognition apparatus has a disadvantage in that, when a high performance processor is mounted, it consumes a lot of power because it performs high performance preprocessing and high performance speech recognition.
- various embodiments of the present disclosure provide an electronic device and a method for speech recognition using two or more processors, such as a processor having low power consumption and a processor performing high performance speech recognition.
- a method using an electronic device may include: obtaining a first voice by at least one of a first voice recognition device and a second voice recognition device; recognizing, through an external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; recognizing a second voice that is additionally recognized when the predetermined command is included in the first voice acquired by the second voice recognition device; and performing an associated operation based on the recognized second voice.
- the electronic device may include at least one of a first voice recognition device and a second voice recognition device for acquiring a first voice, and the electronic device may recognize, through an external electronic device, a second voice that is additionally recognized when the first voice acquired by the first voice recognition device includes a predetermined command, recognize a second voice that is additionally recognized when the first voice acquired by the second voice recognition device includes the predetermined command, and then perform an associated operation based on the recognized second voice.
- according to various embodiments, a voice recognition system can maintain a standby state at all times with low power consumption, natural language voice recognition can answer various user queries, a specific voice command can be handled for an application requiring a fast action response, and even if distortion occurs in a voice signal input for voice recognition, a high voice recognition rate is possible.
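The two-stage arrangement the embodiments describe (a low-power recognizer listening for a predetermined command, then a higher-performance or external recognizer handling the additionally captured voice) can be sketched as follows. All names are hypothetical, and representing voices as plain text stands in for real audio recognition.

```python
# Illustrative two-stage voice recognition flow. The trigger phrase and
# the pre-stored command set are invented for this sketch.

TRIGGER_COMMAND = "hi galaxy"             # predetermined first command
COMMAND_SET = {"call mom", "play music"}  # pre-stored second-command set

def first_stage(first_voice: str) -> bool:
    """Low-power check: does the first voice contain the trigger command?"""
    return TRIGGER_COMMAND in first_voice.lower()

def second_stage(second_voice: str):
    """High-performance recognition of the additionally captured voice."""
    text = second_voice.lower().strip()
    return text if text in COMMAND_SET else None

def handle_utterance(first_voice: str, second_voice: str):
    if not first_stage(first_voice):
        return None                       # stay in low-power standby
    return second_stage(second_voice)     # basis for the associated operation

print(handle_utterance("Hi Galaxy", "Play music"))
```

The design point is that the cheap first stage runs continuously, and the expensive second stage (possibly on an external electronic device) runs only after the trigger is heard.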
- FIG. 1 illustrates a network environment including an electronic device according to various embodiments of the present disclosure.
- FIG. 2 illustrates a configuration of a first speech recognition processor and a second speech recognition processor according to various embodiments.
- FIG. 3 illustrates a configuration of a first speech recognition processor and a second speech recognition processor according to various embodiments.
- FIG. 4 illustrates a configuration of a first speech recognition processor and a second speech recognition processor according to various embodiments.
- FIG. 5 is a diagram illustrating a configuration of a first voice recognition processor and a second voice recognition processor according to various embodiments.
- FIG. 6 illustrates a configuration of a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments of the present disclosure.
- FIG. 7 illustrates a configuration of a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments of the present disclosure.
- FIG. 8 illustrates a configuration of a preprocessor according to various embodiments.
- FIG. 9 is a flowchart illustrating a process of performing voice recognition through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- FIG. 10 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- FIG. 11 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second processor according to various embodiments of the present disclosure.
- FIG. 12 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- FIG. 13 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- FIG. 14 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- FIG. 15 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- FIG. 16 illustrates a process of upgrading a speech recognition model through a third speech recognition processor according to various embodiments.
- FIG. 17 is a block diagram of an electronic device according to various embodiments of the present disclosure.
- the expression “or” includes any and all combinations of words listed together.
- “A or B” may include A, may include B, or may include both A and B.
- Expressions such as "first," "second," "firstly," "secondly," and the like used in various embodiments of the present invention may modify various elements of the various embodiments, but do not limit the corresponding elements.
- the above expressions do not limit the order and / or importance of the corresponding elements.
- the above expressions may be used to distinguish one component from another.
- both a first user device and a second user device are user devices and represent different user devices.
- the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
- when a component is said to be "connected" or "coupled" to another component, the component may be directly connected or coupled to the other component, or there may be new other components between the component and the other component. On the other hand, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that there is no new other component between the component and the other component.
- An electronic device may be a device including a display function.
- the electronic device may be a smartphone, a tablet personal computer, a mobile phone, a video phone, an e-book reader, a desktop personal computer, a laptop personal computer, or a wearable device (e.g., a head-mounted device (HMD)).
- the electronic device may be a smart home appliance with a display function.
- Smart home appliances may include, for example, at least one of a television, a digital video disk (DVD) player, an audio system, a refrigerator, an air conditioner, a cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set-top box, a TV box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console, an electronic dictionary, an electronic key, a camcorder, or an electronic photo frame.
- the electronic device may include at least one of various medical devices (e.g., magnetic resonance angiography (MRA), magnetic resonance imaging (MRI), computed tomography (CT), an imaging device, or ultrasound), a navigation device, a global positioning system (GPS) receiver, an event data recorder (EDR), a flight data recorder (FDR), an automotive infotainment device, marine electronic equipment (e.g., a marine navigation system and gyro compass), avionics, a security device, a vehicle head unit, an industrial or home robot, an automatic teller machine (ATM) of a financial institution, or a point of sales (POS) terminal of a store.
- an electronic device may include at least one of a piece of furniture or a building/structure including a display function, an electronic board, an electronic signature receiving device, a projector, or various measuring devices (for example, water, electricity, gas, or radio wave measuring devices).
- An electronic device according to various embodiments of the present disclosure may be a combination of one or more of the aforementioned various devices.
- an electronic device according to various embodiments of the present disclosure may be a flexible device.
- the electronic device according to various embodiments of the present disclosure is not limited to the above-described devices.
- the term “user” used in various embodiments may refer to a person who uses an electronic device or a device (eg, an artificial intelligence electronic device) that uses an electronic device.
- FIG. 1 illustrates a network environment including an electronic device 101 according to various embodiments of the present disclosure.
- the electronic device 101 may include a bus 110, a processor 120, a memory 130, an input / output interface 140, a display 150, a communication interface 160, a first voice recognition processor 170, and a second voice recognition processor 180.
- the bus 110 may be a circuit connecting the above-described components to each other and transferring communication (eg, a control message) between the above-described components.
- the processor 120 may receive a command from the above-described other components (e.g., the memory 130, the input / output interface 140, the display 150, the communication interface 160, the first voice recognition processor 170, or the second voice recognition processor 180) through the bus 110, decode the received command, and execute an operation or data processing according to the decoded command.
- the memory 130 may store instructions or data received from the processor 120 or other components (e.g., the input / output interface 140, the display 150, the communication interface 160, the first voice recognition processor 170, or the second voice recognition processor 180) or generated by the processor 120 or other components.
- the memory 130 may include, for example, programming modules such as a kernel 131, middleware 132, an application programming interface (API) 133, or an application 134. Each of the aforementioned programming modules may be composed of software, firmware, hardware, or a combination of two or more thereof.
- the kernel 131 may control or manage system resources (e.g., the bus 110, the processor 120, or the memory 130) used to execute operations or functions implemented in the other programming modules, for example, the middleware 132, the API 133, or the application 134.
- the kernel 131 may provide an interface that allows the middleware 132, the API 133, or the application 134 to access and control or manage individual components of the electronic device 101.
- the middleware 132 may serve as an intermediary to allow the API 133 or the application 134 to communicate with the kernel 131 to exchange data.
- the middleware 132 may control work requests (e.g., by scheduling or load balancing) received from the application 134, for example, by assigning at least one of the applications 134 a priority for using a system resource (e.g., the bus 110, the processor 120, or the memory 130) of the electronic device 101.
- the API 133 is an interface for the application 134 to control functions provided by the kernel 131 or the middleware 132, and may include, for example, at least one interface or function (e.g., a command) for file control, window control, image processing, or character control.
- the application 134 may include an SMS / MMS application, an email application, a calendar application, an alarm application, a health care application (for example, an application for measuring exercise amount or blood sugar), or an environment information application (for example, an application that provides barometric pressure, humidity, or temperature information). Additionally or alternatively, the application 134 may be an application related to information exchange between the electronic device 101 and an external electronic device (e.g., the electronic device 104). An application related to the information exchange may include, for example, a notification relay application for delivering specific information to the external electronic device, or a device management application for managing the external electronic device.
- the notification relay application may deliver notification information generated by another application of the electronic device 101 (e.g., an SMS / MMS application, an email application, a health care application, or an environmental information application) to the external electronic device (e.g., the electronic device 104). Additionally or alternatively, the notification relay application may receive notification information from an external electronic device (for example, the electronic device 104) and provide it to the user.
- the device management application may, for example, manage (e.g., install, delete, or update) a function of at least a part of an external electronic device (for example, the electronic device 104) communicating with the electronic device 101 (e.g., turning on / off the external electronic device itself (or some components) or adjusting the brightness (or resolution) of its display), an application running on the external electronic device, or a service provided by the external electronic device (e.g., a call service or a message service).
- the application 134 may include an application designated according to an attribute (eg, a type of electronic device) of the external electronic device (eg, the electronic device 104).
- the application 134 may include an application related to music playback.
- when the external electronic device is a mobile medical device, the application 134 may include an application related to health care.
- the application 134 may include at least one of an application designated to the electronic device 101 or an application received from an external electronic device (for example, the server 106 or the electronic device 104).
- the input / output interface 140 may receive a command or data input from a user through an input / output device (e.g., a sensor, a keyboard, or a touch screen) and transfer it through the bus 110 to, for example, the processor 120, the memory 130, the communication interface 160, the first voice recognition processor 170, or the second voice recognition processor 180.
- the input / output interface 140 may provide data about the user's touch input through the touch screen to the processor 120.
- the input / output interface 140 may output, through the input / output device (e.g., a speaker or a display), a command or data received via the bus 110 from, for example, the processor 120, the memory 130, the communication interface 160, the first voice recognition processor 170, or the second voice recognition processor 180.
- the input / output interface 140 may output voice data processed by the processor 120 to a user through a speaker.
- the display 150 may display various information (eg, multimedia data or text data) to the user.
- the communication interface 160 may connect communication between the electronic device 101 and an external device (for example, the electronic device 104 or the server 106).
- the communication interface 160 may be connected to the network 162 through wireless or wired communication to communicate with the external device.
- the wireless communication may include, for example, at least one of wireless fidelity (WiFi), Bluetooth (BT), near field communication (NFC), global positioning system (GPS), or cellular communication (e.g., LTE, LTE-A, CDMA, WCDMA, UMTS, WiBro, or GSM).
- the wired communication may include, for example, at least one of a universal serial bus (USB), a high definition multimedia interface (HDMI), a reduced standard 232 (RS-232), or a plain old telephone service (POTS).
- the network 162 may be a telecommunications network.
- the communication network may include at least one of a computer network, the Internet, the Internet of things, and a telephone network.
- a protocol (e.g., a transport layer protocol, a data link layer protocol, or a physical layer protocol) for communication between the electronic device 101 and an external device may be supported by at least one of the application 134, the application programming interface 133, the middleware 132, the kernel 131, or the communication interface 160.
- the first voice recognition processor 170 and / or the second voice recognition processor 180 may process at least some of the information obtained from other components (e.g., the processor 120, the memory 130, the input / output interface 140, or the communication interface 160) and provide it to the user in various ways.
- the first voice recognition processor 170 may recognize the first voice received from the input / output interface 140, using the processor 120 or independently of it, and may determine whether a first command is included in the first voice. According to various embodiments of the present disclosure, the first command may be preset as a specific word or set by a user.
- the first voice recognition processor 170 may additionally transmit the received second voice to the external electronic device (e.g., the electronic device 104) so that the external electronic device performs voice recognition on the second voice.
- the second voice recognition processor 180 recognizes the first voice and determines whether the first voice includes the first command; if the first command is included, the second voice recognition processor additionally recognizes the second voice and determines whether the received second voice matches a second command included in a pre-stored voice command set. According to various embodiments of the present disclosure, the second command may include a plurality of words.
- the second voice recognition processor 180 may perform an operation corresponding to the second command.
- the second voice recognition processor 180 may transmit a signal for performing an operation corresponding to the second command to the processor 120 so that the processor 120 may perform the operation.
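Performing (or signaling the processor 120 to perform) the operation corresponding to a recognized second command resembles a simple dispatch table. The following is a hedged sketch only; the command names and the actions mapped to them are invented for illustration.

```python
# Illustrative dispatch of a recognized second command to its operation,
# standing in for the second voice recognition processor handing a signal
# to the main processor. Commands and actions are hypothetical.

OPERATIONS = {
    "play music": lambda: "music_started",
    "call mom": lambda: "dialing",
}

def perform_command(second_command: str):
    """Look up the recognized command in the pre-stored set and run it."""
    action = OPERATIONS.get(second_command.lower())
    return action() if action else None

print(perform_command("Play music"))
```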
- FIG. 2 illustrates a configuration of a first speech recognition processor and a second speech recognition processor according to various embodiments.
- the electronic device 101 may include a first voice recognition processor 170, a second voice recognition processor 180, a microphone 400, a speaker 410, and an audio module 420.
- the microphone 400 may receive a voice signal.
- the microphone may be referred to as a voice input unit.
- the speaker 410 outputs a voice signal.
- the speaker 410 may output an audio signal generated by the execution of an application or a program.
- the speaker may be referred to as an audio output unit.
- the audio module 420 is connected to the first voice recognition processor 170, the second voice recognition processor 180, the microphone 400, and the speaker 410, and performs audio signal processing that converts an analog voice signal into a digital voice signal or converts a digital voice signal into an analog voice signal. In addition, the audio module 420 may perform signal processing such as automatic gain control or equalization on the converted digital signal. The audio module 420 may transmit and receive a voice signal of an application or a program.
- the audio module 420 may be implemented to receive separate power, and may be implemented selectively. In another embodiment, the audio module 420 may be implemented within each of the first voice recognition processor 170 and the second voice recognition processor 180 without receiving separate power.
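The automatic gain control mentioned for the audio module 420 can be illustrated with a minimal peak-normalization sketch. A real AGC adapts its gain smoothly over successive frames; this single-shot version and its target level are assumptions made purely for illustration.

```python
# Toy automatic gain control: scale digital samples so the peak reaches
# a target level. Not the audio module's actual algorithm.

def automatic_gain_control(samples, target_peak=0.9):
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)          # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

out = automatic_gain_control([0.1, -0.3, 0.2])
print(max(abs(s) for s in out))       # peak brought up to the target level
```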
- the first voice recognition processor 170 includes a first voice recognition processor 110, and the first voice recognition processor 110 may include a first preprocessor 111, a first voice recognition model storage 112, and a first voice recognition unit 113.
- the speech recognition model storage unit may be referred to as a speech recognition engine storage unit.
- the first voice recognition processor 170 is a low power processor operating at low power and may perform voice recognition using a first voice recognition model.
- the first voice recognition processor 170 may include a first voice recognition processor 110 including a first preprocessor 111, a first voice recognition model storage 112, and a first voice recognition unit 113.
- the first preprocessor 111 corrects the voice signal input from the microphone 400 and outputs it to the first voice recognition unit 113 before the first voice recognition unit 113 performs voice recognition.
- the first preprocessor 111 may be selectively implemented, and may be omitted depending on the implementation.
- the first speech recognition model storage unit 112 stores a first speech recognition model including various speech recognition algorithms used for speech recognition, and may be generated or updated by speech recognition training.
- the first speech recognition model may include a first level speech recognition algorithm capable of recognizing a first level voice including a predetermined command composed of a specific word or a combination of one or more words.
- the first speech recognition model may be a speaker recognition algorithm.
- the first voice recognition unit 113 performs voice recognition using the first voice recognition model. According to various embodiments of the present disclosure, the first voice recognition unit 113 may recognize the first level voice in the first voice recognition processor 170 operating at low power. For example, the first voice recognition unit 113 may recognize a command composed of a combination of predetermined words such as “Hi, Galaxy”.
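As a hedged sketch of this first level of recognition (the phrase set, function name, and matching rule are illustrative assumptions, not the patent's implementation), the low-power recognizer only needs to match a small, predetermined set of wake phrases:

```python
# Minimal sketch: first-level recognition matches a small, predetermined
# set of wake phrases, as a low-power processor might. The phrase set and
# normalization are illustrative assumptions.
import string

FIRST_LEVEL_COMMANDS = {"hi galaxy"}  # predetermined word combination

def first_level_recognize(transcript: str) -> bool:
    """Return True when the transcribed input matches a predetermined command."""
    cleaned = transcript.lower().translate(
        str.maketrans("", "", string.punctuation)).strip()
    # collapse extra whitespace so "Hi,  Galaxy" still matches
    return " ".join(cleaned.split()) in FIRST_LEVEL_COMMANDS

print(first_level_recognize("Hi, Galaxy"))  # True
print(first_level_recognize("Play"))        # False
```

Because the vocabulary is a fixed set, this check can run continuously at very low cost, which matches the role of the low-power processor described above.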
- the second voice recognition processor 180 includes a second voice recognition processor 220, a controller 210, and an audio manager 230, and the second voice recognition processor 220 includes a second preprocessor 221, a second voice recognition model storage unit 222, and a second voice recognition unit 223.
- the audio manager may be referred to as a voice manager.
- the second voice recognition processor 180 includes a controller 210, a second voice recognition processor 220, and an audio manager 230. According to various embodiments of the present disclosure, the second voice recognition processor 180 may further include a third voice recognition processor including a third preprocessor, a third storage unit, and a third voice recognition unit. Here, the second voice recognition processor 180 may operate at different power from the first voice recognition processor 170.
- the controller 210 controls the overall operations of the first voice recognition processor 170 and / or the second voice recognition processor 180, and performs voice recognition control, signal control between components, and the like.
- the controller 210 may be connected to the audio manager 230 to receive a voice input / output signal.
- the controller 210 may control operations of the first voice recognition processor 110 and the second voice recognition processor 220 by using information of an application and a program and information received from the audio manager 230.
- the controller 210 has been described as being included in the second voice recognition processor 180, but the present invention is not limited thereto; the controller 210 may be included in the first voice recognition processor 170, or may be configured separately from both the first voice recognition processor 170 and the second voice recognition processor 180.
- the first voice recognition processor 170 and / or the second voice recognition processor 180 may control respective operations.
- the second voice recognition processor 220 may include a second preprocessor 221, a second voice recognition model storage unit 222, and a second voice recognition unit 223.
- the second preprocessor 221 modifies the voice signal input from the microphone 400 and outputs the voice signal to the second voice recognition unit 223 before the second voice recognition unit 223 performs voice recognition.
- the second preprocessor 221 may be optionally implemented, and may be omitted according to the implementation.
- the second speech recognition model storage unit 222 stores a second speech recognition model used for speech recognition by the second speech recognition unit 223.
- the second voice recognition model may include a second level speech recognition algorithm capable of recognizing a second level voice, including commands consisting of one word, as well as the first level voice that can be recognized by the first voice recognition model. The second level speech recognition algorithm may recognize more commands than the first level speech recognition algorithm. In addition, the second speech recognition model may be generated or updated by speech recognition training.
- the second voice recognition unit 223 performs second level voice recognition by using the second voice recognition model.
- the second voice recognition unit 223 may perform higher performance voice recognition than the first voice recognition unit 113.
- the second voice recognition unit 223 may recognize a command composed of at least one word such as “Play”, “Stop”, “Pause”, or the like.
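The second level described above can be sketched as a recognizer with a larger single-word vocabulary (the command set and return convention here are illustrative assumptions, not from the patent):

```python
# Illustrative sketch: the second-level recognizer accepts a larger
# vocabulary of single-word commands than the first-level wake-phrase
# matcher. The command set is an assumption for illustration.
SECOND_LEVEL_COMMANDS = {"play", "stop", "pause"}

def second_level_recognize(transcript: str):
    """Return the matched command, or None when no command is recognized."""
    word = transcript.strip().lower()
    return word if word in SECOND_LEVEL_COMMANDS else None

print(second_level_recognize("Play"))   # play
print(second_level_recognize("hello"))  # None
```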
- the audio manager 230 is directly or indirectly connected to the microphone 400 and the speaker 410 to manage the input or output of the voice signal. In addition, the audio manager 230 may transmit the voice signal output from the audio module 420 to the second preprocessor 221. The audio manager 230 may manage audio signal input and output of an application or a program, and determine whether an audio signal is output from the speaker 410.
- FIG. 3 illustrates a configuration of a first voice recognition processor and a second voice recognition processor according to various embodiments of the present disclosure.
- the electronic device 101 may include an audio module, which may be included in each of the first voice recognition processor 170 and the second voice recognition processor 180. Each component of the electronic device 101 may operate in the same manner as described above with reference to FIG. 2.
- the controller 210 may receive a voice signal input from the microphone 400 through the audio manager 230.
- the audio manager 230 may receive a voice signal from the microphone 400 and transmit the voice signal to the speaker 410 so that the voice is output through the speaker 410.
- the second voice recognition unit 223 may perform voice recognition using the first voice recognition model of the first voice recognition processor 170. Also, the first voice recognition unit 113 and / or the second voice recognition unit 223 may perform voice recognition for recognizing a specific speech of a specific speaker.
- FIG. 4 illustrates a configuration of a first speech recognition processor and a second speech recognition processor according to various embodiments.
- the electronic device 101 may include a second voice recognition processor 180 including two voice recognition processors.
- the second voice recognition processor 180 may include a second voice recognition processor 220 and a third voice recognition processor 240.
- the second speech recognition model of the second speech recognition processor 220 and the third speech recognition model of the third speech recognition processor 240 may include different speech recognition algorithms.
- the third speech recognition model may include a third level speech recognition algorithm for recognizing a command composed of a combination of a plurality of words.
- the third level voice may be a phrase and/or sentence composed of a combination of a plurality of words such as “open camera”.
- any one of the second voice recognition model and the third voice recognition model may be the same recognition model as the first voice recognition model.
- the third voice recognition processor 240 may include a third preprocessor 241, a third voice recognition model storage unit 242, and a third voice recognition unit 243.
- the third preprocessor 241 modifies the voice signal input from the microphone 400 and outputs the modified voice signal to the third voice recognition unit 243 before the third voice recognition unit 243 performs voice recognition.
- the third preprocessor 241 may be selectively implemented and may be omitted depending on the implementation.
- the third voice recognition model storage unit 242 stores a third voice recognition model used for voice recognition by the third voice recognition unit 243.
- the third speech recognition model may include a third level speech recognition algorithm capable of recognizing a third level speech including a phrase or a sentence composed of a plurality of words.
- the third level speech recognition algorithm may recognize more commands than the second level speech recognition algorithm.
- the third level speech recognition algorithm may be a natural language recognition algorithm, and may be an algorithm for recognizing a command composed of a combination of a plurality of words such as “open camera”.
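A crude sketch of this third level (the phrase table and action names are hypothetical; a real third-level recognizer would use natural-language parsing rather than exact phrase lookup):

```python
# Illustrative sketch: the third-level recognizer maps multi-word phrases
# such as "open camera" to actions. The phrase table and action names are
# assumptions; real natural-language recognition is far more flexible.
THIRD_LEVEL_COMMANDS = {
    ("open", "camera"): "launch_camera_app",
    ("play", "next", "song"): "skip_track",
}

def third_level_recognize(transcript: str):
    """Return the action for a recognized multi-word command, else None."""
    words = tuple(transcript.strip().lower().split())
    return THIRD_LEVEL_COMMANDS.get(words)

print(third_level_recognize("Open Camera"))  # launch_camera_app
```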
- the third voice recognition unit 243 may perform third level voice recognition using the third voice recognition model.
- FIG. 5 is a diagram illustrating a configuration of a first voice recognition processor and a second voice recognition processor according to various embodiments.
- the electronic device 101 may further include, in addition to the first voice recognition processor 170, a voice processor 150, and a voice signal input from the microphone 400 may be transmitted to the voice processor 150.
- the first voice recognition processor 170 may operate as an audio module.
- the voice processor 150 may convert a voice signal input from the microphone 400, that is, an analog signal, into a digital signal and output it, or may perform voice processing such as automatic gain control (AGC).
- the voice signal processed by the voice processor 150 may be transferred to the second voice recognition processor 220 through the audio manager 230 of the second voice recognition processor 180, or may be used in an application or a program.
- the first voice recognition unit 113 may perform voice recognition using a first voice recognition model.
- the first voice recognition model may include a first level voice recognition algorithm and may be a recognition model for recognizing a voice input or trained by a user.
- the second voice recognition unit 223 may perform voice recognition using a second voice recognition model specialized for the application executed when the application is executed.
- the second speech recognition model may be a word recognition model that can recognize several words or a large vocabulary speech recognition model.
- FIG. 6 illustrates a configuration of a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments of the present disclosure.
- the electronic device 101 may include a first voice recognition processor 170 and a second voice recognition processor 180, and the external electronic device 140 may include a third voice recognition processor 190. It may include.
- the first voice recognition processor 170 may include a first preprocessor 111, a first voice recognition model storage 112, and a first voice recognition unit 113.
- the first preprocessor 111 modifies the received first voice and transmits it to the first voice recognition unit 113.
- the first voice recognition model storage unit 112 may store a first voice recognition model including a first level voice recognition algorithm capable of recognizing a first level voice.
- the first voice recognition unit 113 may recognize the first voice by using the first voice recognition model and determine whether the first command is included in the recognized first voice. When it is determined that the first voice includes the first command, the first voice recognition unit 113 may transmit the input second voice to the third voice recognition processor 190. When it is determined that the first command is not included in the first voice, the first voice recognition unit 113 ends the voice recognition.
- the second voice recognition processor 180 may include a controller 210, a second preprocessor 221, a second voice recognition model storage unit 222, and a second voice recognition unit 223.
- the controller 210 controls the overall operations of the first voice recognition processor 170 and/or the second voice recognition processor 180, and may perform voice recognition control, signal control between components, and the like. According to various embodiments of the present disclosure, when the first voice is received, the controller 210 transmits the first voice to the second preprocessor 221, and when a voice recognition result is received from the second voice recognition unit 223, the controller 210 may perform an operation corresponding to the voice recognition result.
- the controller 210 has been described as being included in the second voice recognition processor 180, but is not limited thereto.
- the controller 210 may be included in the first voice recognition processor 170.
- the first speech recognition processor 170 and the second speech recognition processor 180 may be configured separately.
- the first voice recognition processor 170 and / or the second voice recognition processor 180 may control respective operations.
- the second preprocessor 221 modifies the voice signal and outputs it to the second voice recognition unit 223 before the second voice recognition unit 223 performs voice recognition.
- the second preprocessor 221 may be optionally implemented, and may be omitted according to the implementation.
- the second voice recognition model storage unit 222 stores a second voice recognition model including a second level voice recognition algorithm capable of recognizing a second level voice.
- the second level of voice may include the first level of voice.
- the second voice recognition unit 223 may recognize the first voice by using the second voice recognition model, and determine whether the recognized first voice includes the first command. If it is determined that the first voice includes the first command, the second voice recognition unit 223 may recognize the input second voice and determine whether the recognized second voice includes the second command. If it is determined that the first voice does not include the first command, the second voice recognition unit 223 ends the voice recognition.
- when it is determined that the second voice includes the second command, the second voice recognition unit 223 transmits the voice recognition result to the controller 210, and the controller 210 may perform an operation corresponding to the second command. If it is determined that the second voice does not include the second command, the second voice recognition unit 223 ends the voice recognition.
- the third voice recognition processor 190 may include a third preprocessor 310, a third voice recognition model storage 320, and a third voice recognition unit 330.
- the third preprocessor 310 modifies the voice signal and outputs the modified voice signal to the third voice recognition unit 330 before the third voice recognition unit 330 performs voice recognition.
- the third preprocessor 310 may be selectively implemented and may be omitted depending on the implementation.
- the third voice recognition model storage 320 stores a third voice recognition model including a third level voice recognition algorithm capable of recognizing a third level voice.
- the third voice recognition unit 330 may recognize the second voice by using the third voice recognition model and determine whether the recognized second voice includes the second command and / or the third command. If it is determined that the second voice includes the second command and / or the third command, the third voice recognition unit 330 may transmit a voice recognition result to the second voice recognition processor 180. If it is determined that the second voice does not include the second command and / or the third command, the third voice recognition unit 330 ends the voice recognition.
- the second voice recognition processor 180 may perform an operation corresponding to the second command and / or the third command.
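The two-stage flow just described, where a locally detected first command gates forwarding of the second voice to the external device, might be sketched as follows (the class, wake phrase, and command strings are hypothetical stand-ins, not the patent's API):

```python
# Hypothetical sketch of the two-stage flow: the device checks the first
# voice for the first command and, on a match, forwards the second voice
# to an external recognizer standing in for the third processor 190.
class ExternalRecognizer:
    """Stand-in for the third voice recognition processor on device 140."""
    COMMANDS = {"open camera": "second_command"}

    def recognize(self, voice: str):
        return self.COMMANDS.get(voice.strip().lower())

def handle_voices(first_voice, second_voice, external,
                  first_command="hi galaxy"):
    if first_voice.strip().lower() != first_command:
        return None                          # no first command: end recognition
    return external.recognize(second_voice)  # delegate the second voice

ext = ExternalRecognizer()
print(handle_voices("Hi Galaxy", "Open Camera", ext))  # second_command
print(handle_voices("hello", "Open Camera", ext))      # None
```

The design point is that the expensive external round-trip only happens after the cheap local check succeeds.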
- the electronic device may include at least one of a first voice recognition device and a second voice recognition device for acquiring a first voice. When the first voice acquired by the first voice recognition device includes a predetermined command, an additionally input second voice may be recognized through the external electronic device; when the first voice acquired by the second voice recognition device includes the predetermined command, the second voice recognition device may recognize the additionally input second voice. After the second voice is recognized, an associated operation may be performed based on the recognized second voice.
- FIG. 7 illustrates a configuration of a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments of the present disclosure.
- the first voice recognition processor 170 and the second voice recognition processor 180 may be included, and the external electronic device 140 may include a third voice recognition processor 190.
- the first voice recognition processor 170 may include a first preprocessor 111, a first voice recognition model storage 112, and a first voice recognition unit 113.
- the first preprocessor 111 modifies the received first voice and transmits it to the first voice recognition unit 113.
- the first voice recognition model storage unit 112 may store a first voice recognition model including a first level voice recognition algorithm capable of recognizing a first level voice.
- the first voice recognition unit 113 may recognize the first voice by using the first voice recognition model and determine whether the first command is included in the recognized first voice. When it is determined that the first voice includes the first command, the first voice recognition unit 113 may transmit the input second voice to the third voice recognition processor 190. If it is determined that the first command is not included in the first voice, the first voice recognition unit 113 may transmit the first voice to the second voice recognition processor 180.
- the second voice recognition processor 180 may include a controller 210, a second preprocessor 221, a second voice recognition model storage unit 222, and a second voice recognition unit 223.
- the controller 210 transmits the first voice to the second preprocessor 221, and when a voice recognition result is received from the second voice recognition unit 223, the controller 210 may perform an operation corresponding to the voice recognition result.
- the second preprocessor 221 modifies the voice signal and outputs it to the second voice recognition unit 223 before the second voice recognition unit 223 performs voice recognition.
- the second preprocessor 221 may be optionally implemented, and may be omitted according to the implementation.
- the second voice recognition model storage unit 222 stores a second voice recognition model including a second level voice recognition algorithm capable of recognizing a second level voice.
- the second voice recognition unit 223 may recognize the first voice by using the second voice recognition model, and determine whether the recognized first voice includes the first command. If it is determined that the first voice includes the first command, the second voice recognition unit 223 may recognize the input second voice and determine whether the recognized second voice includes the second command. If it is determined that the first voice does not include the first command, the second voice recognition unit 223 ends the voice recognition.
- when it is determined that the second voice includes the second command, the second voice recognition unit 223 transmits the voice recognition result to the controller 210, and the controller 210 may perform an operation corresponding to the second command. If it is determined that the second voice does not include the second command, the second voice recognition unit 223 ends the voice recognition.
- the second voice recognition unit 223 may also determine whether the first voice includes the second command. If it is determined that the first voice includes the second command, the second voice recognition unit 223 may transmit a voice recognition result to the controller 210.
- the third voice recognition processor 190 may include a third preprocessor 310, a third voice recognition model storage 320, and a third voice recognition unit 330.
- the third preprocessor 310 modifies the voice signal and outputs the modified voice signal to the third voice recognition unit 330 before the third voice recognition unit 330 performs voice recognition.
- the third preprocessor 310 may be selectively implemented and may be omitted depending on the implementation.
- the third voice recognition model storage 320 stores a third voice recognition model including a third level voice recognition algorithm capable of recognizing a third level voice.
- the third voice recognition unit 330 may recognize the second voice by using the third voice recognition model and determine whether the recognized second voice includes the second command and / or the third command. If it is determined that the second voice includes the second command and / or the third command, the third voice recognition unit 330 may transmit a voice recognition result to the second voice recognition processor 180. If it is determined that the second voice does not include the second command and / or the third command, the third voice recognition unit 330 ends the voice recognition.
- FIG. 8 illustrates a configuration of a preprocessor according to various embodiments.
- the preprocessor 800 may include an adaptive echo canceller (AEC) 801, a noise suppressor (NS) 802, an end-point detector (EPD) 803, and an automatic gain controller (AGC) 804.
- the adaptive echo canceller 801 removes an echo from the voice signal input from the microphone 510 based on a reference signal. For example, if a voice signal is input while an application that outputs sound, such as a call, ring tone, music player, or camera, is executed by the second voice recognition processor 180, the adaptive echo canceller 801 may remove from the input voice signal the echo introduced by the application execution and transmit the result to the voice recognition unit 820.
- the noise suppression unit 802 suppresses noise from the input voice signal.
- the end point detector 803 detects an end point of the voice in order to find a portion in which the voice actually exists in the input voice signal.
- the automatic gain controller 804 automatically adjusts the gain so that a voice signal of adequate level is obtained even when the intensity of the input voice signal changes.
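The chain of the four stages above can be sketched with toy stand-in transforms on a digitized signal (a list of floats); each function below is an illustrative simplification, not real DSP:

```python
# Toy sketch of the preprocessing chain AEC -> NS -> EPD -> AGC. Each
# stage is reduced to a simple stand-in; real implementations differ.
def acoustic_echo_cancel(signal, echo_ref):
    return [s - e for s, e in zip(signal, echo_ref)]       # subtract echo estimate

def noise_suppress(signal, floor=0.05):
    return [s if abs(s) > floor else 0.0 for s in signal]  # gate low-level noise

def end_point_detect(signal):
    voiced = [i for i, s in enumerate(signal) if s != 0.0]
    return signal[voiced[0]:voiced[-1] + 1] if voiced else []  # trim silence

def auto_gain_control(signal, target_peak=1.0):
    if not signal:
        return signal
    peak = max(abs(s) for s in signal) or 1.0
    return [s * target_peak / peak for s in signal]        # normalize level

def preprocess(signal, echo_ref):
    cleaned = acoustic_echo_cancel(signal, echo_ref)
    cleaned = noise_suppress(cleaned)
    cleaned = end_point_detect(cleaned)
    return auto_gain_control(cleaned)

print(preprocess([0.0, 0.02, 0.5, -0.25, 0.0], [0.0] * 5))  # [1.0, -0.5]
```

The ordering matters: echo and noise are removed before end-point detection so that silence trimming and gain normalization operate on the cleaned signal.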
- some or all of these components may be excluded from the first preprocessor 111 in order to operate at low power, while all of the components may be included in the second preprocessor 221 to improve speech recognition performance.
- the embodiment of the present invention is not limited thereto, and each component may be included or excluded in various ways.
- the first voice recognition processor 170 may be implemented as a low power processor. Even when the second voice recognition processor 180 is in the sleep mode, the first voice recognition processor 170 may wait for input of a voice signal.
- the dormant mode may be a state in which the screen of the electronic device 101 is turned off and no power, or only the minimum power needed to operate necessary components, is supplied.
- when a voice is input from the microphone 400, the first voice recognition unit 113 of the first voice recognition processor 170 performs voice recognition on the input voice. If the input voice includes a command for activating the second voice recognition processor 180, the first voice recognition unit 113 transfers a signal for activating the second voice recognition processor 180 to the controller 210. Thereafter, the controller 210 may activate the second voice recognition processor 220 to perform voice recognition.
- the controller 210 may use application information and information received from the audio manager 230 to control the operation of the first voice recognition processor 170, or the operations of the first voice recognition processor 110 and the second voice recognition processor 220.
- when a voice is received, the electronic device 101 may perform voice recognition through the first voice recognition processor 170, and, based on the operation of audio signal processing components such as the audio module, the speaker, and the audio manager, may stop the voice recognition of the first voice recognition processor 170 and perform voice recognition through the second voice recognition processor 180.
- in other words, voice recognition may be performed by selecting one of the processor operating at low power and the high performance processor according to whether an audio signal is output from the speaker.
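The selection rule described above can be reduced to a one-line dispatch (the processor names are illustrative labels, not the patent's identifiers):

```python
# Sketch of the selection rule: when audio is playing from the speaker,
# use the high-performance processor, whose preprocessor can cancel the
# resulting echo; otherwise use the low-power processor to save energy.
def select_recognizer(speaker_outputting_audio: bool) -> str:
    if speaker_outputting_audio:
        return "second_voice_recognition_processor"  # high performance
    return "first_voice_recognition_processor"       # low power

print(select_recognizer(True))   # second_voice_recognition_processor
print(select_recognizer(False))  # first_voice_recognition_processor
```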
- FIG. 9 is a flowchart illustrating a process of performing voice recognition through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- the controller 210 controls the audio manager 230 to determine whether a voice is output from the speaker 410. If a voice is output, the controller 210 proceeds to step 910 to deactivate the first voice recognition processor 170; if no voice is output, the controller 210 proceeds to step 930.
- here, the deactivation means stopping the voice recognition of the first voice recognition processor 110 by stopping the power supply to the first voice recognition processor 110 in the first voice recognition processor 170.
- after step 910, the controller 210 proceeds to step 920 to perform voice recognition through the second voice recognition processor 220 of the second voice recognition processor 180.
- in step 930, the controller 210 deactivates the second voice recognition processor 220 and the audio module 420 of the second voice recognition processor 180; that is, the controller 210 stops the power supply to the second voice recognition processor 220 and the audio module 420 to switch them to the dormant state, stopping their voice recognition operation.
- in step 940, the controller 210 performs voice recognition through the first voice recognition processor 170.
- FIG. 10 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- in step 1000, the controller 210 determines whether a voice is input; if a voice is input, the controller 210 proceeds to step 1010, and if no voice is input, the controller 210 proceeds to step 1020.
- step 1010 the controller 210 performs voice recognition through the second voice recognition processor 220.
- in step 1020, the controller 210 deactivates the second voice recognition processor 220 and activates the first voice recognition processor 170 in step 1030.
- here, the activation means supplying power to the first voice recognition processor 170 in the dormant state to switch it to a state in which the voice recognition operation of the first voice recognition processor 110 in the first voice recognition processor 170 can be performed.
- the controller 210 performs voice recognition through the first voice recognition processor 170.
- FIG. 11 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second processor according to various embodiments of the present disclosure.
- in step 1100, the controller 210 determines whether an application that outputs audio is running. If such an application is running, the controller 210 proceeds to step 1110; if not, the controller 210 proceeds to step 1120.
- the controller 210 may determine that sound is output to the speaker when an application for outputting audio is running.
- step 1110 the controller 210 performs voice recognition through the second voice recognition processor 220.
- step 1120 the controller 210 deactivates the second voice recognition processor 220, and activates the first voice recognition processor 170 in step 1130.
- step 1140 the controller 210 performs voice recognition through the activated first voice recognition processor 170.
- FIG. 12 is a flowchart illustrating a process of performing voice recognition by a controller through a first voice recognition processor or a second voice recognition processor according to various embodiments of the present disclosure.
- in step 1200, the controller 210 determines whether the audio module 420 is activated. If the audio module 420 is activated, the controller 210 proceeds to step 1210; if not, the controller 210 proceeds to step 1220.
- activation of the audio module 420 may mean a state in which the audio module 420 is operating.
- in step 1210, the controller 210 performs voice recognition through the second voice recognition processor 220.
- in step 1220, the controller 210 deactivates the second voice recognition processor 220 and activates the first voice recognition processor 170 in step 1230.
- step 1240 the controller 210 performs voice recognition through the first voice recognition processor 170.
- the controller 210 may receive a voice such as “Hi Galaxy” from the microphone 400 and then activate a specific voice recognition processor. Thereafter, using the activated voice recognition processor, the controller 210 may perform additional speech recognition or stop or start the operation of the specific voice recognition processor.
- a voice may be recognized by the first voice recognition processor 110 of the first voice recognition processor 170 or by the second voice recognition processor 220 of the second voice recognition processor 180.
- the controller 210, while performing voice recognition through the first voice recognition processor 170, determines whether audio is output to the speaker 410; when audio is output to the speaker 410, the controller 210 may deactivate the first voice recognition processor 170 and activate the second voice recognition processor 220. According to various embodiments of the present disclosure, the controller 210 may determine whether audio is output to the speaker 410 by checking whether a music playback application is running or the audio module 420 is activated.
- the second preprocessor 221 performs signal processing such as AEC to suppress distortion of the input voice, and delivers the purified voice to the second voice recognition unit 223.
- FIG. 13 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- the first voice recognition processor 170 and the second voice recognition processor 180 may receive the first voice from the microphone 400.
- the first voice recognition processor 170 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1302, and if not, the first voice recognition processor 170 ends voice recognition.
- the first voice recognition processor 170 determines whether the second voice is received; if so, the process proceeds to step 1303, and if not, the first voice recognition processor 170 ends voice recognition.
- the first voice recognition processor 170 transfers the received second voice to the third voice recognition processor 190 and ends its own voice recognition. After receiving and recognizing the second voice, the third voice recognition processor 190 may transfer the recognition result to the first voice recognition processor 170 or the second voice recognition processor 180, and the first voice recognition processor 170 or the second voice recognition processor 180 may perform an operation corresponding to the recognition result.
- the second voice recognition processor 180 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1305, and if not, voice recognition ends.
- the second voice recognition processor 180 determines whether the second voice is received, proceeds to step 1306 when the second voice is received, and ends the voice recognition when the second voice is not received.
- the second voice recognition processor 180 recognizes the received second voice. If the second voice includes the second command, the second voice recognition processor 180 may proceed to step 1307 and perform an operation corresponding to the second command.
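The two FIG. 13 paths share the same shape: verify the first command, wait for the second voice, then recognize it either locally (second processor) or via the third, e.g. server-side, processor (first processor). A minimal sketch, with all function names assumed for illustration:

```python
def run_recognizer(first_voice, second_voice, is_first_command,
                   recognize_locally=None, third_processor=None):
    """Sketch of one recognizer path in FIG. 13 (names illustrative).

    The first voice recognition processor would pass `third_processor`
    (off-loading the second voice); the second voice recognition
    processor would pass `recognize_locally` instead."""
    if not is_first_command(first_voice):   # no first command -> end
        return None
    if second_voice is None:                # no follow-on voice -> end
        return None
    if third_processor is not None:
        # First-processor path: hand the second voice to the third
        # (e.g., server-side) processor and act on its result.
        return third_processor(second_voice)
    # Second-processor path: recognize the second voice locally.
    return recognize_locally(second_voice)
```

The same skeleton covers FIG. 14, which differs only in which branch is taken when the first command is missing.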
- a method using an electronic device may include: acquiring, by at least one of the first and second voice recognition devices, a first voice; recognizing, through an external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; recognizing a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device; and performing an associated operation based on the recognized second voice.
- FIG. 14 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- the first voice recognition processor 170 may receive a first voice from the microphone 400.
- the first voice recognition processor 170 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1402, and if not, operation 1404 may be performed.
- the first voice recognition processor 170 determines whether the second voice is received; if so, the process proceeds to step 1403, and if not, the first voice recognition processor 170 ends voice recognition.
- the first voice recognition processor 170 transfers the received second voice to the third voice recognition processor 190 and ends its own voice recognition. After receiving and recognizing the second voice, the third voice recognition processor 190 may transfer the recognition result to the first voice recognition processor 170 or the second voice recognition processor 180, and the first voice recognition processor 170 or the second voice recognition processor 180 may perform an operation corresponding to the recognition result.
- the second voice recognition processor 180 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1405, and if not, voice recognition ends.
- the second voice recognition processor 180 determines whether the second voice is received, proceeds to step 1406 when the second voice is received, and ends the voice recognition when the second voice is not received.
- the second voice recognition processor 180 recognizes the received second voice. If the second voice includes the second command, the second voice recognition processor 180 may proceed to step 1407 and perform an operation corresponding to the second command.
- FIG. 15 illustrates a process of performing voice recognition through a first voice recognition processor, a second voice recognition processor, and a third voice recognition processor according to various embodiments.
- the first voice recognition processor 170 may execute a specific application.
- the first voice recognition processor 170 may receive a first voice from the microphone 400.
- the first voice recognition processor 170 may determine whether voice recognition is possible for the application being executed; if it is possible, the process proceeds to step 1503, and if not, to step 1507.
- the first voice recognition processor 170 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1504, and if not, operation 1505 may be performed.
- the first voice recognition processor 170 may transfer the received second voice to the third voice recognition processor 190.
- the first voice recognition processor 170 recognizes the first voice and determines whether the first voice includes the third command; if so, the process proceeds to step 1506, and if not, voice recognition ends.
- the first voice recognition processor 170 may perform an operation corresponding to the third command.
- the second voice recognition processor 180 may perform voice recognition on the running application.
- the second voice recognition processor 180 recognizes the first voice and determines whether the first voice includes the first command; if so, the process proceeds to step 1509, and if not, operations corresponding to steps 1505 and 1506 may be performed.
- the second voice recognition processor 180 determines whether the second voice includes the second command. If the second voice includes the second command, the second voice recognition processor 180 may proceed to step 1510 and perform an operation corresponding to the second command.
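The FIG. 15 branching for the first voice recognition processor can be sketched as follows. The names `offload` and `act`, and the `"handover"` sentinel for the step-1507 hand-off to the second processor, are illustrative stand-ins, not names from the patent:

```python
def first_processor_route(app_asr_ok, first_voice, is_first_cmd, is_third_cmd,
                          second_voice, offload, act):
    """Sketch of the first-processor branching in FIG. 15.

    `offload` stands for handing the second voice to the third voice
    recognition processor; `act` stands for the local handler of the
    third command."""
    if app_asr_ok:                         # step 1502: app supports ASR
        if is_first_cmd(first_voice):      # step 1503: first command found
            return offload(second_voice)   # step 1504: off-load second voice
        if is_third_cmd(first_voice):      # step 1505: third command found
            return act(first_voice)        # step 1506: handle it locally
        return None                        # neither command -> end
    return "handover"                      # step 1507: second processor takes over
```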
- FIG. 16 illustrates a process of upgrading a speech recognition model through a third speech recognition processor according to various embodiments.
- the third voice recognition processor 190 recognizes the second voice.
- the third voice recognition processor 190 determines whether a command related to the second voice exists among preset commands; if such a command exists, the process proceeds to step 1602, and if not, to step 1603. For example, when the recognized second voice is "Begin," the third voice recognition processor 190 may determine whether a command related to and/or similar to "Begin" exists.
- the third voice recognition processor 190 may update the second voice recognition model storage unit 222 by matching the corresponding command with the recognized second voice.
- For example, when the command "Start" is related and/or similar to the recognized "Begin," the third voice recognition processor 190 may determine that the recognized "Begin" corresponds to "Start" and update the second voice recognition model storage unit 222 accordingly. In other words, the third voice recognition processor 190 may add and store "Begin" alongside "Start" as a command for starting video playback among the functions of a video player application capable of playing video.
- the third voice recognition processor 190 determines whether a device function related to the second voice exists; if such a function exists, the process proceeds to step 1604, and if there is no device function related to the second voice, the command update operation ends. For example, when the video player application is running and the second voice is "stop," the third voice recognition processor 190 may determine whether there is a video player function associated with stopping.
- the third voice recognition processor 190 may update the second voice recognition model storage unit 222 by matching the corresponding device function with the recognized second voice. For example, when the video player function related to “stop” is “play stop”, the third voice recognition processor 190 may set and store “stop” as a command for performing the “play stop” function.
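The FIG. 16 update flow, alias a recognized word to a similar known command, otherwise bind it to a related device function, might look like the following sketch. The `SIMILAR` table and the `model_store` dictionary are hypothetical stand-ins for the real similarity matching and the second voice recognition model storage unit 222:

```python
SIMILAR = {"begin": "start"}  # hypothetical similarity table ("Begin" ~ "Start")


def update_model(recognized, command_set, device_functions, model_store):
    """Sketch of the FIG. 16 command-update flow.

    If the recognized word matches (or is similar to) a preset command,
    store it as an alias of that command; otherwise, if a device
    function lists it, bind the word to that function."""
    word = recognized.lower()
    related_cmd = SIMILAR.get(word, word)
    if related_cmd in command_set:                   # steps 1601 -> 1602
        # Store the new word as an alias of the existing command.
        model_store.setdefault(related_cmd, set()).add(word)
        return "command"
    for func, words in device_functions.items():     # steps 1603 -> 1604
        if word in words:
            # Bind the word directly to the related device function.
            model_store.setdefault(func, set()).add(word)
            return "function"
    return None                                      # nothing to update
```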
- the first voice recognition processor, the second voice recognition processor, and the third voice recognition processor may perform voice recognition even when an application is executed and / or when the electronic device is in a sleep mode.
- the first voice recognition processor, the second voice recognition processor, and the third voice recognition processor may recognize only a wake-up command (e.g., "Hi Galaxy") in the sleep mode, and may recognize other commands once the device is awake.
- For example, the first voice recognition processor 170 and/or the second voice recognition processor 180 may execute an application capable of natural language voice recognition and recognize the received "Hi Galaxy." Thereafter, when "open camera" is received, the first voice recognition processor 170 transmits the input "open camera" to the third voice recognition processor 190 and, when the recognition result is received from the third voice recognition processor 190, executes the camera application according to the recognition result. Alternatively, the second voice recognition processor 180 may recognize the received "open camera" itself and execute the camera application.
- Likewise, when a related command is received, the second voice recognition processor 180 may recognize it and perform a function of a related music player application.
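The "open camera" example can be summarized as two alternative paths: recognition on-device versus off-loaded to the third processor. `LOCAL_COMMANDS` and both handler names are assumptions for illustration:

```python
LOCAL_COMMANDS = {"open camera": "camera"}  # hypothetical on-device command table


def second_processor_handle(utterance, launch):
    """Second-processor path: recognize "open camera" on-device and
    launch the camera application directly."""
    app = LOCAL_COMMANDS.get(utterance.lower())
    if app is not None:
        launch(app)
    return app


def first_processor_handle(utterance, third_processor, launch):
    """First-processor path: forward the utterance to the third
    processor and launch whatever application its result names."""
    app = third_processor(utterance)
    if app is not None:
        launch(app)
    return app
```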
- FIG. 17 is a block diagram 1700 of an electronic device 1701 according to various embodiments.
- the electronic device 1701 may configure, for example, all or part of the electronic device 101 shown in FIG. 1.
- the electronic device 1701 may include at least one application processor (AP) 1710, a communication module 1720, a subscriber identification module (SIM) card 1724, a memory 1730, a sensor module 1740, an input device 1750, a display 1760, an interface 1770, an audio module 1780, a camera module 1791, a power management module 1795, a battery 1796, an indicator 1797, and a motor 1798.
- the AP 1710 may control a plurality of hardware or software components connected to the AP 1710 by operating an operating system or an application program, and may perform various data processing and operations including multimedia data.
- the AP 1710 may be implemented with, for example, a system on chip (SoC).
- the AP 1710 may further include a graphic processing unit (GPU).
- the communication module 1720 may perform data transmission and reception in communication between the electronic device 1701 (e.g., the electronic device 101) and other electronic devices (e.g., the electronic device 104 or the server 106) connected through a network. According to an embodiment, the communication module 1720 may include a cellular module 1721, a Wifi module 1723, a BT module 1725, a GPS module 1727, an NFC module 1728, and a radio frequency (RF) module 1729.
- the cellular module 1721 may provide a voice call, a video call, a text service, or an Internet service through a communication network (eg, LTE, LTE-A, CDMA, WCDMA, UMTS, WiBro, or GSM).
- the cellular module 1721 may perform identification and authentication of an electronic device in a communication network using, for example, a subscriber identification module (eg, the SIM card 1724).
- the cellular module 1721 may perform at least some of the functions that the AP 1710 may provide.
- the cellular module 1721 may perform at least part of a multimedia control function.
- the cellular module 1721 may include a Communication Processor (CP).
- the cellular module 1721 may be implemented with, for example, an SoC.
- In FIG. 17, components such as the cellular module 1721 (e.g., a communication processor), the memory 1730, and the power management module 1795 are illustrated as separate from the AP 1710.
- the AP 1710 may be implemented to include at least some of the aforementioned components (eg, the cellular module 1721).
- the AP 1710 or the cellular module 1721 may load, into a volatile memory, instructions or data received from a non-volatile memory or from at least one of the other components connected thereto, and process the loaded instructions or data.
- the AP 1710 or the cellular module 1721 may store data received from at least one of the other components or generated by at least one of the other components in a nonvolatile memory.
- Each of the Wifi module 1723, the BT module 1725, the GPS module 1727, and the NFC module 1728 may include, for example, a processor for processing data transmitted and received through the corresponding module.
- In FIG. 17, the cellular module 1721, the Wifi module 1723, the BT module 1725, the GPS module 1727, and the NFC module 1728 are shown as separate blocks, but according to an embodiment, at least some (e.g., two or more) of them may be included in one integrated chip (IC) or IC package.
- the processors corresponding to the cellular module 1721, the Wifi module 1723, the BT module 1725, the GPS module 1727, and the NFC module 1728 may be implemented as one SoC.
- the RF module 1729 may transmit and receive data, for example, an RF signal.
- the RF module 1729 may include, for example, a transceiver, a power amplifier module (PAM), a frequency filter, a low noise amplifier (LNA), or the like.
- the RF module 1729 may further include a component for transmitting and receiving electromagnetic waves in free space for wireless communication, for example, a conductor or a conducting wire.
- In FIG. 17, the cellular module 1721, the Wifi module 1723, the BT module 1725, the GPS module 1727, and the NFC module 1728 are shown sharing a single RF module 1729, but according to an embodiment, at least one of them may transmit and receive RF signals through a separate RF module.
- the SIM card 1724 may be a card including a subscriber identification module and may be inserted into a slot formed at a specific position of the electronic device.
- the SIM card 1724 may include unique identification information (eg, an integrated circuit card identifier (ICCID)) or subscriber information (eg, an international mobile subscriber identity (IMSI)).
- the memory 1730 may include an internal memory 1732 or an external memory 1734.
- the internal memory 1732 may include at least one of, for example, a volatile memory (e.g., a dynamic RAM (DRAM), a static RAM (SRAM), or a synchronous dynamic RAM (SDRAM)) or a non-volatile memory (e.g., a one time programmable ROM (OTPROM), a programmable ROM (PROM), an erasable and programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM), a mask ROM, a flash ROM, a NAND flash memory, or a NOR flash memory).
- the internal memory 1732 may be a solid state drive (SSD).
- the external memory 1734 may further include a flash drive, for example, a compact flash (CF), a secure digital (SD), a micro secure digital (Micro-SD), a mini secure digital (Mini-SD), an extreme digital (xD), or a Memory Stick.
- the external memory 1734 may be functionally connected to the electronic device 1701 through various interfaces.
- the electronic device 1701 may further include a storage device (or a storage medium) such as a hard drive.
- the sensor module 1740 may measure a physical quantity or detect an operation state of the electronic device 1701, and may convert the measured or detected information into an electrical signal.
- the sensor module 1740 may include, for example, a gesture sensor 1740A, a gyro sensor 1740B, an air pressure sensor 1740C, a magnetic sensor 1740D, an acceleration sensor 1740E, a grip sensor 1740F, a proximity sensor 1740G, a color sensor 1740H (e.g., an RGB (red, green, blue) sensor), a biometric sensor 1740I, a temperature/humidity sensor 1740J, an illuminance sensor 1740K, or an ultraviolet (UV) sensor 1740M.
- Additionally or alternatively, the sensor module 1740 may include, for example, an olfactory (E-nose) sensor (not shown), an electromyography (EMG) sensor (not shown), an electroencephalogram (EEG) sensor (not shown), an electrocardiogram (ECG) sensor (not shown), an infrared (IR) sensor (not shown), an iris sensor (not shown), or a fingerprint sensor (not shown).
- the sensor module 1740 may further include a control circuit for controlling one or more sensors included therein.
- the input device 1750 may include a touch panel 1752, a (digital) pen sensor 1754, a key 1756, or an ultrasonic input device 1758.
- the touch panel 1752 may recognize a touch input using at least one of capacitive, resistive, infrared, or ultrasonic methods.
- the touch panel 1752 may further include a control circuit. In the case of the capacitive type, physical contact or proximity recognition is possible.
- the touch panel 1752 may further include a tactile layer; in this case, the touch panel 1752 may provide a tactile response to the user.
- the (digital) pen sensor 1754 may be implemented, for example, using a method identical or similar to receiving a user's touch input, or using a separate recognition sheet.
- the key 1756 may include, for example, a physical button, an optical key or a keypad.
- the ultrasonic input device 1758 detects sound waves with a microphone (e.g., the microphone 1788) of the electronic device 1701 from an input tool that generates an ultrasonic signal, enabling wireless recognition of data.
- the electronic device 1701 may receive a user input from an external device (for example, a computer or a server) connected thereto using the communication module 1720.
- the display 1760 may include a panel 1762, a hologram device 1764, or a projector 1766.
- the panel 1762 may be, for example, a liquid-crystal display (LCD) or an active-matrix organic light-emitting diode (AM-OLED).
- the panel 1762 may be implemented to be, for example, flexible, transparent, or wearable.
- the panel 1762 may be configured as a single module together with the touch panel 1752.
- the hologram device 1764 may show a stereoscopic image in the air by using interference of light.
- the projector 1766 may display an image by projecting light onto a screen.
- the screen may be located inside or outside the electronic device 1701.
- the display 1760 may further include a control circuit for controlling the panel 1762, the hologram device 1764, or the projector 1766.
- the interface 1770 may include, for example, a high-definition multimedia interface (HDMI) 1772, a universal serial bus (USB) 1774, an optical interface 1776, or a D-subminiature (D-sub) 1778.
- the interface 1770 may be included in, for example, the communication interface 160 illustrated in FIG. 1.
- Additionally or alternatively, the interface 1770 may include, for example, a mobile high-definition link (MHL) interface, a secure digital (SD) card / multi-media card (MMC) interface, or an infrared data association (IrDA) compliant interface.
- the audio module 1780 may bidirectionally convert a sound and an electric signal. At least some components of the audio module 1780 may be included in, for example, the input / output interface 140 illustrated in FIG. 1.
- the audio module 1780 may process sound information input or output through, for example, a speaker 1782, a receiver 1784, an earphone 1786, a microphone 1788, or the like.
- the camera module 1791 is a device capable of capturing a still image and a moving image, and according to an embodiment may include at least one image sensor (e.g., a front sensor or a rear sensor), a lens (not shown), an image signal processor (ISP, not shown), or a flash (e.g., an LED or a xenon lamp, not shown).
- the power management module 1795 may manage power of the electronic device 1701. Although not shown, the power management module 1795 may include, for example, a power management integrated circuit (PMIC), a charger integrated circuit (IC), or a battery or fuel gauge.
- the PMIC may be mounted in, for example, an integrated circuit or an SoC semiconductor.
- Charging methods may be divided into wired and wireless.
- the charger IC may charge a battery and prevent overvoltage or overcurrent from flowing from a charger.
- the charger IC may include a charger IC for at least one of the wired charging method and the wireless charging method.
- Examples of the wireless charging method include a magnetic resonance method, a magnetic induction method, and an electromagnetic wave method, and additional circuits for wireless charging, such as a coil loop, a resonant circuit, or a rectifier, may be added.
- the battery gauge may measure, for example, the remaining amount of the battery 1796, a voltage, a current, or a temperature during charging.
- the battery 1796 may store or generate electricity, and supply power to the electronic device 1701 using the stored or generated electricity.
- the battery 1796 may include, for example, a rechargeable battery or a solar battery.
- the indicator 1797 may display a specific state of the electronic device 1701 or a portion thereof (for example, the AP 1710), for example, a booting state, a message state, or a charging state.
- the motor 1798 may convert electrical signals into mechanical vibrations.
- the electronic device 1701 may include a processing device (eg, a GPU) for supporting mobile TV.
- the processing apparatus for supporting mobile TV may process media data according to a standard such as digital multimedia broadcasting (DMB), digital video broadcasting (DVB), or media flow.
- Each of the above-described elements of the electronic device according to various embodiments of the present disclosure may be composed of one or more components, and the name of the corresponding element may vary according to the type of the electronic device.
- An electronic device according to various embodiments of the present disclosure may be configured to include at least one of the above-described components, and some components may be omitted or further include other additional components.
- some of the components of the electronic device according to various embodiments of the present disclosure may be combined to form one entity, which may perform the same functions as the corresponding components before being combined.
- The term "module" used in various embodiments of the present invention may mean, for example, a unit including one of, or a combination of two or more of, hardware, software, and firmware.
- the term “module” may be interchangeably used with terms such as, for example, unit, logic, logical block, component, or circuit.
- the module may be a minimum unit or part of an integrally constructed part.
- the module may be a minimum unit or part of performing one or more functions.
- the module may be implemented mechanically or electronically.
- a "module" in accordance with various embodiments of the present invention may include at least one of an application-specific integrated circuit (ASIC) chip, field-programmable gate arrays (FPGAs), or a programmable-logic device, known or to be developed, which performs certain operations.
- According to various embodiments, at least a portion of an apparatus (e.g., modules or functions thereof) or a method (e.g., operations) may be implemented by instructions stored in a computer-readable storage medium in the form of a programming module. When the instructions are executed by one or more processors (e.g., the processor 210), the one or more processors may perform the functions corresponding to the instructions.
- the computer-readable storage medium may be, for example, the memory 130.
- At least a part of the programming module may be implemented (for example, executed) by, for example, the processor 210.
- At least some of the programming modules may include, for example, modules, programs, routines, sets of instructions, or processes for performing one or more functions.
- the computer-readable recording medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as compact disc read only memory (CD-ROM) and digital versatile disc (DVD); magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions (e.g., programming modules), such as read only memory (ROM), random access memory (RAM), and flash memory.
- Program instructions may include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter.
- the hardware device described above may be configured to operate as one or more software modules to perform the operations of the various embodiments of the present invention, and vice versa.
- Modules or programming modules may include at least one or more of the aforementioned components, omit some of them, or further include additional components.
- Operations performed by modules, programming modules, or other components in accordance with various embodiments of the present invention may be executed in a sequential, parallel, repetitive, or heuristic manner. In addition, some operations may be executed in a different order, omitted, or other operations may be added.
- a storage medium storing instructions, wherein the instructions are configured to cause at least one processor to perform at least one operation when executed by the at least one processor, the at least one operation including: acquiring, by at least one of the first voice recognition device and the second voice recognition device, a first voice; recognizing, through an external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; recognizing a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device; and performing an associated operation based on the recognized second voice.
- a storage medium storing instructions, wherein the instructions are configured to cause at least one processor to perform at least one operation when executed by the at least one processor, the at least one operation including: executing an application; acquiring, by at least one of the first voice recognition device and the second voice recognition device, a first voice with respect to the application; recognizing, through the external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; and recognizing, through the external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- User Interface Of Digital Computer (AREA)
- Telephone Function (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
Claims (18)
- A method using an electronic device, the method comprising: acquiring, by at least one of a first voice recognition device and a second voice recognition device, a first voice; recognizing, through an external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; recognizing a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device; and performing an associated operation based on the recognized second voice.
- The method of claim 1, further comprising performing an associated operation based on the second voice recognized by the external electronic device.
- The method of claim 1, comprising: acquiring the first voice by the second voice recognition device when the command is not included in the first voice acquired by the first voice recognition device; recognizing a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device; and performing an associated operation based on the recognized second voice.
- The method of claim 3, comprising: recognizing the second voice through a third voice recognition device when recognition of the second voice acquired by the second voice recognition device fails; and performing an associated operation based on the second voice recognized by the third voice recognition device.
- A method using an electronic device, the method comprising: executing an application; acquiring, by at least one of the first voice recognition device and the second voice recognition device, a first voice with respect to the application; recognizing, through the external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device; and recognizing, through the external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device.
- The method of claim 5, comprising performing a related operation based on another command when the command is not included in the first voice acquired by the second voice recognition device but the other command is included.
- The method of claim 6, comprising: recognizing the second voice by the third voice recognition device; and updating a preset command set based on the second voice when a command related to the second voice is included in the command set.
- The method of claim 7, comprising updating the command set with a command related to the second voice when a command related to the second voice is not included in the command set.
- An electronic device comprising at least one of a first voice recognition device and a second voice recognition device for acquiring a first voice, wherein the electronic device is configured to recognize, through an external electronic device, a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the first voice recognition device, to recognize a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device, and then to perform an associated operation based on the recognized second voice.
- The electronic device of claim 9, wherein at least one of the first voice recognition device and the second voice recognition device is configured to perform an associated operation based on the second voice recognized by the external electronic device.
- The electronic device of claim 9, configured to acquire the first voice by the second voice recognition device when the command is not included in the first voice acquired by the first voice recognition device, to recognize a second voice that is additionally recognized when a predetermined command is included in the first voice acquired by the second voice recognition device, and to perform an associated operation based on the recognized second voice.
- The electronic device of claim 11, configured to recognize the second voice through a third voice recognition device when recognition of the second voice acquired by the second voice recognition device fails, and to perform an associated operation based on the second voice recognized by the third voice recognition device.
- An electronic device comprising at least one of a first voice recognition device or a second voice recognition device for acquiring, when an application is executed, a first voice with respect to the application, the electronic device being configured to: recognize an additionally received second voice through an external electronic device when the first voice acquired by the first voice recognition device includes a predetermined command; and recognize the additionally received second voice through the external electronic device when the first voice acquired by the second voice recognition device includes the predetermined command.
- The electronic device of claim 13, configured to perform a related operation based on another command when the first voice acquired by the second voice recognition device does not include the command but includes the other command.
- The electronic device of claim 14, configured to: recognize the second voice by a third voice recognition device; and update a preset command set based on the second voice when the command set includes a command associated with the second voice.
- The electronic device of claim 15, configured to add a command associated with the second voice to the command set when the command set does not include a command associated with the second voice.
- A computer-readable recording medium storing instructions of a program for executing operations comprising: acquiring a first voice by at least one of a first voice recognition device or a second voice recognition device; when the first voice acquired by the first voice recognition device includes a predetermined command, recognizing an additionally received second voice through an external electronic device; when the first voice acquired by the second voice recognition device includes the predetermined command, recognizing the additionally received second voice; and performing an associated operation based on the recognized second voice.
- A computer-readable recording medium storing instructions of a program for executing operations comprising: executing an application; acquiring, with respect to the application, a first voice by at least one of a first voice recognition device or a second voice recognition device; when the first voice acquired by the first voice recognition device includes a predetermined command, recognizing an additionally received second voice through an external electronic device; and when the first voice acquired by the second voice recognition device includes the predetermined command, recognizing the additionally received second voice through the external electronic device.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/915,068 US10192557B2 (en) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition using a plurality of voice recognition engines |
EP14840410.6A EP3040985B1 (en) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition |
CN201480047495.1A CN105493180B (zh) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition |
KR1020167007691A KR102394485B1 (ko) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition |
US16/259,506 US11158326B2 (en) | 2013-08-26 | 2019-01-28 | Electronic device and method for voice recognition using a plurality of voice recognition devices |
US17/509,403 US20220044690A1 (en) | 2013-08-26 | 2021-10-25 | Electronic device and method for voice recognition |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20130101411 | 2013-08-26 | ||
KR10-2013-0101411 | 2013-08-26 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/915,068 A-371-Of-International US10192557B2 (en) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition using a plurality of voice recognition engines |
US16/259,506 Continuation US11158326B2 (en) | 2013-08-26 | 2019-01-28 | Electronic device and method for voice recognition using a plurality of voice recognition devices |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015030474A1 true WO2015030474A1 (ko) | 2015-03-05 |
Family
ID=52586943
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2014/007951 WO2015030474A1 (ko) | 2013-08-26 | 2014-08-26 | Electronic device and method for voice recognition |
Country Status (5)
Country | Link |
---|---|
US (3) | US10192557B2 (ko) |
EP (1) | EP3040985B1 (ko) |
KR (1) | KR102394485B1 (ko) |
CN (1) | CN105493180B (ko) |
WO (1) | WO2015030474A1 (ko) |
Families Citing this family (131)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
CN113470641B (zh) | 2013-02-07 | 2023-12-15 | Apple Inc. | Voice trigger for a digital assistant |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
DE112014002747T5 (de) | 2013-06-09 | 2016-03-03 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
CN105453026A (zh) | 2013-08-06 | 2016-03-30 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
CN105493180B (zh) | 2013-08-26 | 2019-08-30 | Samsung Electronics Co., Ltd. | Electronic device and method for voice recognition |
KR102179506B1 (ko) * | 2013-12-23 | 2020-11-17 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
TWI566107B (zh) | 2014-05-30 | 2017-01-11 | Apple Inc. | Method, non-transitory computer-readable storage medium and electronic device for processing multi-part voice commands |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
FR3030177B1 (fr) * | 2014-12-16 | 2016-12-30 | Stmicroelectronics Rousset | Electronic device comprising a module for waking an electronic apparatus distinct from a processing core |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US9691378B1 (en) * | 2015-11-05 | 2017-06-27 | Amazon Technologies, Inc. | Methods and devices for selectively ignoring captured audio data |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10074364B1 (en) * | 2016-02-02 | 2018-09-11 | Amazon Technologies, Inc. | Sound profile generation based on speech recognition results exceeding a threshold |
US20170330566A1 (en) * | 2016-05-13 | 2017-11-16 | Bose Corporation | Distributed Volume Control for Speech Recognition |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US10271093B1 (en) * | 2016-06-27 | 2019-04-23 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US10931999B1 (en) * | 2016-06-27 | 2021-02-23 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US20180025731A1 (en) * | 2016-07-21 | 2018-01-25 | Andrew Lovitt | Cascading Specialized Recognition Engines Based on a Recognition Policy |
KR102575634B1 (ko) * | 2016-07-26 | 2023-09-06 | Samsung Electronics Co., Ltd. | Electronic device and method of operating the electronic device |
US10540441B2 (en) | 2016-10-21 | 2020-01-21 | Samsung Electronics Co., Ltd. | Device and method for providing recommended words for character input |
KR102417046B1 (ko) * | 2016-10-21 | 2022-07-06 | Samsung Electronics Co., Ltd. | Device and method for providing recommended words for characters input by a user |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US10748531B2 (en) * | 2017-04-13 | 2020-08-18 | Harman International Industries, Incorporated | Management layer for multiple intelligent personal assistant services |
US10580402B2 (en) * | 2017-04-27 | 2020-03-03 | Microchip Technology Incorporated | Voice-based control in a media system or other voice-controllable sound generating system |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
DK201770439A1 (en) * | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770427A1 (en) | 2017-05-12 | 2018-12-20 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | MULTI-MODAL INTERFACES |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10607606B2 (en) * | 2017-06-19 | 2020-03-31 | Lenovo (Singapore) Pte. Ltd. | Systems and methods for execution of digital assistant |
KR101910385B1 (ko) * | 2017-06-22 | 2018-10-22 | LG Electronics Inc. | Vehicle control device provided in a vehicle, and method for controlling the vehicle |
GB2578386B (en) | 2017-06-27 | 2021-12-01 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB2563953A (en) | 2017-06-28 | 2019-01-02 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201713697D0 (en) | 2017-06-28 | 2017-10-11 | Cirrus Logic Int Semiconductor Ltd | Magnetic detection of replay attack |
GB201801528D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801527D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801526D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
GB201801532D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for audio playback |
WO2019035504A1 (ko) * | 2017-08-16 | 2019-02-21 | LG Electronics Inc. | Mobile terminal and control method therefor |
KR102411766B1 (ko) * | 2017-08-25 | 2022-06-22 | Samsung Electronics Co., Ltd. | Method for activating a voice recognition service and electronic device implementing the same |
CN107590096B (zh) * | 2017-08-31 | 2021-06-15 | Lenovo (Beijing) Co., Ltd. | Method for a processor in an electronic device, and processor |
KR20190033384A (ko) * | 2017-09-21 | 2019-03-29 | Samsung Electronics Co., Ltd. | Electronic device for processing a user utterance and control method thereof |
GB201801661D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic International Uk Ltd | Detection of liveness |
GB201804843D0 (en) | 2017-11-14 | 2018-05-09 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB2567503A (en) | 2017-10-13 | 2019-04-17 | Cirrus Logic Int Semiconductor Ltd | Analysing speech signals |
GB201801664D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
GB201801663D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
US10665234B2 (en) * | 2017-10-18 | 2020-05-26 | Motorola Mobility Llc | Detecting audio trigger phrases for a voice recognition session |
GB201801659D0 (en) | 2017-11-14 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of loudspeaker playback |
KR102071865B1 (ko) * | 2017-11-30 | 2020-01-31 | Inteloid Co., Ltd. | Device and method for recognizing a wake-up word using server recognition results |
US11182122B2 (en) * | 2017-12-08 | 2021-11-23 | Amazon Technologies, Inc. | Voice control of computing devices |
US11264037B2 (en) | 2018-01-23 | 2022-03-01 | Cirrus Logic, Inc. | Speaker identification |
US11735189B2 (en) | 2018-01-23 | 2023-08-22 | Cirrus Logic, Inc. | Speaker identification |
US11475899B2 (en) | 2018-01-23 | 2022-10-18 | Cirrus Logic, Inc. | Speaker identification |
KR102459920B1 (ko) | 2018-01-25 | 2022-10-27 | 삼성전자주식회사 | 저전력 에코 제거를 지원하는 애플리케이션 프로세서, 이를 포함하는 전자 장치 및 그 동작 방법 |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
KR102592769B1 (ko) | 2018-07-20 | 2023-10-24 | Samsung Electronics Co., Ltd. | Electronic device and operation method thereof |
US10692490B2 (en) | 2018-07-31 | 2020-06-23 | Cirrus Logic, Inc. | Detection of replay attack |
WO2020034104A1 (zh) * | 2018-08-14 | 2020-02-20 | Huawei Technologies Co., Ltd. | Voice recognition method, wearable device and system |
JP7167554B2 (ja) * | 2018-08-29 | 2022-11-09 | Fujitsu Limited | Speech recognition device, speech recognition program, and speech recognition method |
US10915614B2 (en) | 2018-08-31 | 2021-02-09 | Cirrus Logic, Inc. | Biometric authentication |
US11037574B2 (en) | 2018-09-05 | 2021-06-15 | Cirrus Logic, Inc. | Speaker recognition and speaker change detection |
JP7009338B2 (ja) * | 2018-09-20 | 2022-01-25 | TVS Regza Corporation | Information processing device, information processing system, and video device |
US11315553B2 (en) | 2018-09-20 | 2022-04-26 | Samsung Electronics Co., Ltd. | Electronic device and method for providing or obtaining data for training thereof |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
JP7202853B2 (ja) * | 2018-11-08 | 2023-01-12 | Sharp Corporation | Refrigerator |
JP7023823B2 (ja) * | 2018-11-16 | 2022-02-22 | Alpine Electronics, Inc. | In-vehicle device and voice recognition method |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
JP7225876B2 (ja) * | 2019-02-08 | 2023-02-21 | Fujitsu Limited | Information processing device, arithmetic processing device, and method for controlling the information processing device |
US11741529B2 (en) * | 2019-02-26 | 2023-08-29 | Xenial, Inc. | System for eatery ordering with mobile interface and point-of-sale terminal |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
CN110223696B (zh) * | 2019-05-22 | 2024-04-05 | Ping An Technology (Shenzhen) Co., Ltd. | Voice signal collection method, apparatus, and terminal device |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
CN110427097A (zh) * | 2019-06-18 | 2019-11-08 | Huawei Technologies Co., Ltd. | Voice data processing method, apparatus, and system |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
KR20210066647A (ko) * | 2019-11-28 | 2021-06-07 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof |
DE102020200067A1 (de) * | 2020-01-07 | 2021-07-08 | Robert Bosch Gesellschaft mit beschränkter Haftung | Device and method for operating voice assistants |
KR20210136463A (ko) | 2020-05-07 | 2021-11-17 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11038934B1 (en) | 2020-05-11 | 2021-06-15 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
KR20220142757A (ko) * | 2021-04-15 | 2022-10-24 | Samsung Electronics Co., Ltd. | Electronic device and method for determining whether an object is in proximity to the electronic device |
KR20230017971A (ko) * | 2021-07-28 | 2023-02-07 | Samsung Electronics Co., Ltd. | Electronic device and operation method of the electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6185535B1 (en) * | 1998-10-16 | 2001-02-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Voice control of a user interface to service applications |
US20040034527A1 (en) * | 2002-02-23 | 2004-02-19 | Marcus Hennecke | Speech recognition system |
KR20040072691A (ko) * | 2001-12-29 | 2004-08-18 | Motorola Inc. | Method and apparatus for multi-level distributed speech recognition |
KR20120066561A (ko) * | 2010-12-14 | 2012-06-22 | ENM System Co., Ltd. | Voice recognition system that performs voice recognition on low-frequency-band sound in a standby state, and control method thereof |
KR20130083371A (ko) * | 2012-01-09 | 2013-07-22 | Samsung Electronics Co., Ltd. | Image apparatus and control method thereof |
Family Cites Families (61)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9415930D0 (en) | 1994-08-04 | 1994-09-28 | Forbo Nairn Ltd | Floor coverings |
US6070140A (en) | 1995-06-05 | 2000-05-30 | Tran; Bao Q. | Speech recognizer |
US7174299B2 (en) * | 1995-08-18 | 2007-02-06 | Canon Kabushiki Kaisha | Speech recognition system, speech recognition apparatus, and speech recognition method |
US5855000A (en) * | 1995-09-08 | 1998-12-29 | Carnegie Mellon University | Method and apparatus for correcting and repairing machine-transcribed input using independent or cross-modal secondary input |
JP2002540477A (ja) * | 1999-03-26 | 2002-11-26 | Koninklijke Philips Electronics N.V. | Client-server speech recognition |
US6408272B1 (en) * | 1999-04-12 | 2002-06-18 | General Magic, Inc. | Distributed voice user interface |
JP2000322078A (ja) | 1999-05-14 | 2000-11-24 | Sumitomo Electric Ind Ltd | Vehicle-mounted speech recognition device |
WO2001001389A2 (de) | 1999-06-24 | 2001-01-04 | Siemens Aktiengesellschaft | Method and device for speech recognition |
US6963759B1 (en) * | 1999-10-05 | 2005-11-08 | Fastmobile, Inc. | Speech recognition technique based on local interrupt detection |
US20020046203A1 (en) * | 2000-06-22 | 2002-04-18 | The Sony Corporation/Sony Electronics Inc. | Method and apparatus for providing ratings of web sites over the internet |
FR2820872B1 (fr) * | 2001-02-13 | 2003-05-16 | Thomson Multimedia Sa | Voice recognition method, module, device and server |
US7072837B2 (en) * | 2001-03-16 | 2006-07-04 | International Business Machines Corporation | Method for processing initially recognized speech in a speech recognition session |
US6738743B2 (en) * | 2001-03-28 | 2004-05-18 | Intel Corporation | Unified client-server distributed architectures for spoken dialogue systems |
JP2003241790A (ja) * | 2002-02-13 | 2003-08-29 | Internatl Business Mach Corp &lt;Ibm&gt; | Voice command processing system, computer device, voice command processing method, and program |
US7386454B2 (en) * | 2002-07-31 | 2008-06-10 | International Business Machines Corporation | Natural error handling in speech recognition |
US7228275B1 (en) * | 2002-10-21 | 2007-06-05 | Toyota Infotechnology Center Co., Ltd. | Speech recognition system having multiple speech recognizers |
US6834265B2 (en) | 2002-12-13 | 2004-12-21 | Motorola, Inc. | Method and apparatus for selective speech recognition |
US7392182B2 (en) * | 2002-12-18 | 2008-06-24 | Harman International Industries, Inc. | Speech recognition system |
US7418392B1 (en) * | 2003-09-25 | 2008-08-26 | Sensory, Inc. | System and method for controlling the operation of a device by voice commands |
US6889189B2 (en) * | 2003-09-26 | 2005-05-03 | Matsushita Electric Industrial Co., Ltd. | Speech recognizer performance in car and home applications utilizing novel multiple microphone configurations |
US7340395B2 (en) * | 2004-04-23 | 2008-03-04 | Sap Aktiengesellschaft | Multiple speech recognition engines |
US8589156B2 (en) * | 2004-07-12 | 2013-11-19 | Hewlett-Packard Development Company, L.P. | Allocation of speech recognition tasks and combination of results thereof |
US20060085199A1 (en) * | 2004-10-19 | 2006-04-20 | Yogendra Jain | System and method for controlling the behavior of a device capable of speech recognition |
ATE385024T1 (de) * | 2005-02-21 | 2008-02-15 | Harman Becker Automotive Sys | Multilingual speech recognition |
EP1796080B1 (en) * | 2005-12-12 | 2009-11-18 | Gregory John Gadbois | Multi-voice speech recognition |
US8234120B2 (en) * | 2006-07-26 | 2012-07-31 | Nuance Communications, Inc. | Performing a safety analysis for user-defined voice commands to ensure that the voice commands do not cause speech recognition ambiguities |
US8099287B2 (en) * | 2006-12-05 | 2012-01-17 | Nuance Communications, Inc. | Automatically providing a user with substitutes for potentially ambiguous user-defined speech commands |
US20110054900A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Hybrid command and control between resident and remote speech recognition facilities in a mobile voice-to-speech application |
JP5310563B2 (ja) * | 2007-12-25 | 2013-10-09 | NEC Corporation | Speech recognition system, speech recognition method, and speech recognition program |
US8099289B2 (en) * | 2008-02-13 | 2012-01-17 | Sensory, Inc. | Voice interface and search for electronic devices including bluetooth headsets and remote systems |
US8364481B2 (en) * | 2008-07-02 | 2013-01-29 | Google Inc. | Speech recognition with parallel recognition tasks |
KR20100032140A (ko) | 2008-09-17 | 2010-03-25 | Hyundai Autonet Co., Ltd. | Interactive voice recognition method and voice recognition apparatus |
WO2010078386A1 (en) * | 2008-12-30 | 2010-07-08 | Raymond Koverzin | Power-optimized wireless communications device |
US8892439B2 (en) * | 2009-07-15 | 2014-11-18 | Microsoft Corporation | Combination and federation of local and remote speech recognition |
JP5545467B2 (ja) * | 2009-10-21 | 2014-07-09 | National Institute of Information and Communications Technology | Speech translation system, control device, and information processing method |
US8311820B2 (en) * | 2010-01-28 | 2012-11-13 | Hewlett-Packard Development Company, L.P. | Speech recognition based on noise level |
KR101699720B1 (ko) * | 2010-08-03 | 2017-01-26 | Samsung Electronics Co., Ltd. | Voice command recognition apparatus and voice command recognition method |
US8930194B2 (en) * | 2011-01-07 | 2015-01-06 | Nuance Communications, Inc. | Configurable speech recognition system using multiple recognizers |
US8996381B2 (en) * | 2011-09-27 | 2015-03-31 | Sensory, Incorporated | Background speech recognition assistant |
US8340975B1 (en) * | 2011-10-04 | 2012-12-25 | Theodore Alfred Rosenberger | Interactive speech recognition device and system for hands-free building control |
US8972263B2 (en) * | 2011-11-18 | 2015-03-03 | Soundhound, Inc. | System and method for performing dual mode speech recognition |
US9129591B2 (en) * | 2012-03-08 | 2015-09-08 | Google Inc. | Recognizing speech in multiple languages |
US9117449B2 (en) * | 2012-04-26 | 2015-08-25 | Nuance Communications, Inc. | Embedded system for construction of small footprint speech recognition with user-definable constraints |
KR20130133629A (ko) * | 2012-05-29 | 2013-12-09 | Samsung Electronics Co., Ltd. | Apparatus and method for executing a voice command in an electronic device |
US9142215B2 (en) * | 2012-06-15 | 2015-09-22 | Cypress Semiconductor Corporation | Power-efficient voice activation |
US9959865B2 (en) * | 2012-11-13 | 2018-05-01 | Beijing Lenovo Software Ltd. | Information processing method with voice recognition |
US9875741B2 (en) * | 2013-03-15 | 2018-01-23 | Google Llc | Selective speech recognition for chat and digital personal assistant systems |
US9894312B2 (en) * | 2013-02-22 | 2018-02-13 | The Directv Group, Inc. | Method and system for controlling a user receiving device using voice commands |
US20140270260A1 (en) * | 2013-03-13 | 2014-09-18 | Aliphcom | Speech detection using low power microelectrical mechanical systems sensor |
CN110096712B (zh) * | 2013-03-15 | 2023-06-20 | Apple Inc. | User training by intelligent digital assistant |
US20140297288A1 (en) * | 2013-03-29 | 2014-10-02 | Orange | Telephone voice personal assistant |
CN103198831A (zh) * | 2013-04-10 | 2013-07-10 | VIA Technologies, Inc. | Voice control method and mobile terminal apparatus |
US9058805B2 (en) * | 2013-05-13 | 2015-06-16 | Google Inc. | Multiple recognizer speech recognition |
CN105493180B (zh) * | 2013-08-26 | 2019-08-30 | Samsung Electronics Co., Ltd. | Electronic device and method for voice recognition |
US9245527B2 (en) * | 2013-10-11 | 2016-01-26 | Apple Inc. | Speech recognition wake-up of a handheld portable electronic device |
US20150169285A1 (en) * | 2013-12-18 | 2015-06-18 | Microsoft Corporation | Intent-based user experience |
EP3084760A4 (en) * | 2013-12-20 | 2017-08-16 | Intel Corporation | Transition from low power always listening mode to high power speech recognition mode |
EP3100259A4 (en) * | 2014-01-31 | 2017-08-30 | Hewlett-Packard Development Company, L.P. | Voice input command |
US9378740B1 (en) * | 2014-09-30 | 2016-06-28 | Amazon Technologies, Inc. | Command suggestions during automatic speech recognition |
US9775113B2 (en) * | 2014-12-11 | 2017-09-26 | Mediatek Inc. | Voice wakeup detecting device with digital microphone and associated method |
CN107134279B (zh) * | 2017-06-30 | 2020-06-19 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voice wake-up method, apparatus, terminal, and storage medium |
- 2014
- 2014-08-26 CN CN201480047495.1A patent/CN105493180B/zh active Active
- 2014-08-26 WO PCT/KR2014/007951 patent/WO2015030474A1/ko active Application Filing
- 2014-08-26 EP EP14840410.6A patent/EP3040985B1/en active Active
- 2014-08-26 US US14/915,068 patent/US10192557B2/en active Active
- 2014-08-26 KR KR1020167007691A patent/KR102394485B1/ko active IP Right Grant
- 2019
- 2019-01-28 US US16/259,506 patent/US11158326B2/en active Active
- 2021
- 2021-10-25 US US17/509,403 patent/US20220044690A1/en active Pending
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150379992A1 (en) * | 2014-06-30 | 2015-12-31 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
KR20160001964A (ko) * | 2014-06-30 | 2016-01-07 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
US9679563B2 (en) | 2014-06-30 | 2017-06-13 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
US10062382B2 (en) | 2014-06-30 | 2018-08-28 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
US10643613B2 (en) | 2014-06-30 | 2020-05-05 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
KR102208477B1 (ko) | 2014-06-30 | 2021-01-27 | Samsung Electronics Co., Ltd. | Operating method for microphones and electronic device supporting the same |
WO2017209333A1 (ko) * | 2016-06-02 | 2017-12-07 | LG Electronics Inc. | Home automation system and control method thereof |
Also Published As
Publication number | Publication date |
---|---|
KR20160055162A (ko) | 2016-05-17 |
KR102394485B1 (ko) | 2022-05-06 |
US11158326B2 (en) | 2021-10-26 |
EP3040985A1 (en) | 2016-07-06 |
US20190228781A1 (en) | 2019-07-25 |
CN105493180B (zh) | 2019-08-30 |
US20160217795A1 (en) | 2016-07-28 |
US20220044690A1 (en) | 2022-02-10 |
EP3040985B1 (en) | 2023-08-23 |
US10192557B2 (en) | 2019-01-29 |
CN105493180A (zh) | 2016-04-13 |
EP3040985A4 (en) | 2017-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015030474A1 (ko) | Electronic device and method for voice recognition | |
WO2018135743A1 (ko) | Method for sensing end of speech, and electronic device implementing same | |
WO2016003144A1 (en) | Operating method for microphones and electronic device supporting the same | |
AU2015350680B2 (en) | Power control method and apparatus for reducing power consumption | |
WO2018174545A1 (en) | Method and electronic device for transmitting audio data to multiple external devices | |
WO2017069595A1 (en) | Electronic device and method for executing function using speech recognition thereof | |
WO2015034163A1 (en) | Method of providing notification and electronic device thereof | |
WO2017131322A1 (en) | Electronic device and speech recognition method thereof | |
WO2019164146A1 (en) | System for processing user utterance and controlling method thereof | |
WO2017131449A1 (en) | Electronic device and method for running function according to transformation of display of electronic device | |
WO2017082653A1 (en) | Electronic device and method for wireless charging in electronic device | |
WO2015126121A1 (ko) | Method for controlling a device according to request information, and device supporting same | |
WO2018182293A1 (en) | Method for operating speech recognition service and electronic device supporting the same | |
WO2021025350A1 (en) | Electronic device managing plurality of intelligent agents and operation method thereof | |
WO2017155326A1 (en) | Electronic device and method for driving display thereof | |
WO2017142256A1 (en) | Electronic device for authenticating based on biometric data and operating method thereof | |
WO2017082685A1 (ko) | Display control method, and display panel, display device and electronic device implementing same | |
WO2017209502A1 (ko) | Method for controlling connection between an electronic device and a charging device, and device for providing same | |
WO2019004659A1 (en) | DISPLAY CONTROL METHOD AND ELECTRONIC DEVICE SUPPORTING SAID METHOD | |
WO2017119662A1 (en) | Electronic device and operating method thereof | |
WO2016039531A1 (en) | Electronic device and control method thereof | |
WO2016148505A1 (en) | Electronic apparatus and battery information providing method thereof | |
WO2016190619A1 (ko) | Electronic device, gateway, and control method therefor | |
WO2019050242A1 (en) | ELECTRONIC DEVICE, SERVER, AND RECORDING MEDIUM SUPPORTING THE EXECUTION OF A TASK USING AN EXTERNAL DEVICE | |
WO2016099086A1 (en) | Method for providing communication service and electronic device thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201480047495.1 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14840410 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14915068 Country of ref document: US |
|
ENP | Entry into the national phase |
Ref document number: 20167007691 Country of ref document: KR Kind code of ref document: A |
|
REEP | Request for entry into the european phase |
Ref document number: 2014840410 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2014840410 Country of ref document: EP |