WO2017206133A1 - Speech recognition method and device - Google Patents

Speech recognition method and device

Info

Publication number
WO2017206133A1
WO2017206133A1 PCT/CN2016/084463 CN2016084463W WO2017206133A1 WO 2017206133 A1 WO2017206133 A1 WO 2017206133A1 CN 2016084463 W CN2016084463 W CN 2016084463W WO 2017206133 A1 WO2017206133 A1 WO 2017206133A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
voice information
group word
word division
module
Prior art date
Application number
PCT/CN2016/084463
Other languages
French (fr)
Chinese (zh)
Inventor
吴刚
党君利
柳义庆
冯晓龙
Original Assignee
深圳市智物联网络有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市智物联网络有限公司
Priority to PCT/CN2016/084463
Publication of WO2017206133A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems

Definitions

  • The present invention relates to the field of voice technologies and, in particular, to a voice recognition method and apparatus.
  • Embodiments of the present invention provide a voice recognition method capable of accurately identifying the operation instruction content in voice information when multiple voices are present.
  • Embodiments of the present invention further provide a voice recognition apparatus capable of accurately identifying the operation instruction content in voice information when multiple voices are present.
  • If the voice information matches the operation instruction template, the operation indicated by the voice information is performed; if the voice information does not match the operation instruction template, no operation is performed.
  • Different operation instruction templates are set for different service interfaces, and the operation instruction template corresponding to the current service interface is used as the reference for determining whether the received voice information matches. The operation indicated by the voice information is performed only if the match succeeds. This prevents similar-sounding voice input from being treated as an operation instruction when multiple voices are present and interrupting the service currently being provided, so that the operation instruction content in the voice information is identified accurately.
  • The operation instruction template includes a keyword arrangement order and a keyword lexicon.
  • The operation instruction template in the embodiment of the present invention includes not only the keyword lexicon but also the keyword arrangement order, which makes matching against the operation instruction template stricter and identifies the operation instruction content in the voice information more accurately.
  • Determining whether the voice information matches the operation instruction template includes:
  • determining, according to the splitting and combining of the keywords obtained after group word division, whether those keywords are included in the keyword lexicon;
  • if the keywords obtained after group word division are included in the keyword lexicon, determining whether they match the keyword arrangement order; if they match the keyword arrangement order, determining that the voice information matches the operation instruction template; if they do not match the keyword arrangement order, determining that the voice information does not match the operation instruction template.
  • The embodiment of the present invention adopts a voice information segmentation technique: the received voice information is divided into group words so that the voice information can be identified accurately.
  • The method further includes:
  • if a keyword obtained after group word division is not included in the keyword lexicon, displaying that keyword.
  • When a keyword obtained after group word division is not included in the keyword lexicon, the keyword may be displayed, and if a confirmation instruction is received, the method continues with the step of determining whether the keywords match the keyword arrangement order. This avoids misjudging some keywords obtained after group word division because the keyword lexicon is incomplete.
  • The method further includes:
  • if the keywords obtained after group word division include the instruction keyword, performing the step of determining whether the keywords obtained after group word division are included in the keyword lexicon; if the keywords obtained after group word division do not include the instruction keyword, determining that the voice information does not match the operation instruction template.
  • An embodiment of the present invention provides a voice recognition apparatus, including:
  • a voice information receiving module configured to receive voice information;
  • a determining module configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface;
  • a voice information response module configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
  • The operation instruction template includes a keyword arrangement order and a keyword lexicon.
  • The determining module includes:
  • a voice information analysis sub-module configured to perform group word division on the voice information;
  • a first determining sub-module configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to operate; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template;
  • a second determining sub-module configured to determine whether the keywords obtained after group word division match the keyword arrangement order; when they match the keyword arrangement order, to determine that the voice information matches the operation instruction template; and when they do not match the keyword arrangement order, to determine that the voice information does not match the operation instruction template.
  • The first determining sub-module includes:
  • a first judgment execution sub-module configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to operate; and when they are not included in the keyword lexicon, to trigger the display sub-module to operate;
  • a display sub-module configured to display, when a keyword obtained after group word division is not included in the keyword lexicon, the keyword that is not included in the keyword lexicon;
  • a triggering module configured to trigger the second determining sub-module to operate after a confirmation instruction is received, and to determine that the voice information does not match the operation instruction template after a negative instruction is received.
  • The first determining sub-module includes:
  • a second judgment execution sub-module configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords include the instruction keyword; when they include the instruction keyword, to trigger the third judgment execution sub-module to operate; and when they do not include the instruction keyword, to determine that the voice information does not match the operation instruction template;
  • a third judgment execution sub-module configured to determine whether the keywords obtained after group word division are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to operate; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template.
  • FIG. 1 is a flowchart of a method for voice recognition according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for voice recognition according to an embodiment of the present invention.
  • FIG. 2A is a schematic diagram of a system interface in an embodiment of the present invention.
  • FIG. 3 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • FIG. 4 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • FIG. 5 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • FIG. 5A is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • FIG. 6 is a block diagram of an apparatus 600 for speech recognition, according to an exemplary embodiment.
  • FIG. 1 is a flowchart of a method for voice recognition according to an embodiment of the present invention; the method may be applied to a terminal.
  • In step 11, voice information is received.
  • In step 12, it is determined whether the voice information matches the operation instruction template corresponding to the current service interface; if so, step 13 is performed; otherwise, no operation is performed.
  • In step 13, the operation indicated by the voice information is performed.
  • The operation instruction template in the embodiment of the present invention may include a keyword arrangement order and a keyword lexicon.
  • Different service interfaces correspond to different operation instruction templates.
  • For example, the navigation service interface corresponds to one operation instruction template, and the music service interface corresponds to another operation instruction template.
  • the operation instruction template corresponding to the navigation service interface is shown in Table 1.
  • the operation instruction template corresponding to the music service interface is shown in Table 2.
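
Tables 1 and 2 are not reproduced in this text, so the following is only a minimal sketch of how an operation instruction template might be represented. It assumes that the keyword lexicon maps each keyword to a category and that the keyword arrangement order is a sequence of those categories. The class name `OperationTemplate`, the helper `template_for`, the category names, and the sample keywords are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class OperationTemplate:
    """Hypothetical container for one service interface's template."""
    # Keyword lexicon: keyword text -> category (category names are assumptions).
    lexicon: dict
    # Keyword arrangement order: the categories a valid instruction must follow.
    order: tuple


# Illustrative entries only; the patent's Table 1 and Table 2 are not shown here.
TEMPLATES = {
    "navigation": OperationTemplate(
        lexicon={"navigate to": "instruction", "tiananmen square": "place"},
        order=("instruction", "place"),
    ),
    "music": OperationTemplate(
        lexicon={"play": "instruction", "song 1": "song"},
        order=("instruction", "song"),
    ),
}


def template_for(interface: str) -> OperationTemplate:
    """Return the template saved for the given service interface."""
    return TEMPLATES[interface]
```

Keeping one template per interface means that a keyword such as "play" is simply absent from the navigation lexicon, so the same utterance can be valid in one interface and ignored in another.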
  • FIG. 2 is a flowchart of a method for voice recognition according to an embodiment of the present invention, and the method may be applied to a terminal.
  • In step 21, voice information is received.
  • In step 22, the operation instruction template corresponding to the current service interface is determined.
  • An interface wake-up instruction may be input by voice on the system interface shown in FIG. 2A, which displays the services available to the current user in the form of service channels.
  • For example, the voice input may be "open the music interface" or "open the navigation interface".
  • The terminal opens the current service interface corresponding to the interface wake-up instruction, and subsequent operations are performed on the basis of the opened current service interface.
  • The correspondence between service interfaces and operation instruction templates is saved in the terminal, so the operation instruction template corresponding to the current service interface can be determined from the current service interface.
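
The wake-up step can be pictured with a small sketch: a wake-up phrase selects the current service interface, and the saved correspondence then yields the template to use. The class `Terminal`, its method names, and the template identifiers are hypothetical; the two wake-up phrases are the examples quoted in the text.

```python
class Terminal:
    """Hypothetical terminal that tracks the current service interface."""

    # Wake-up phrases from the examples in the text; the mapping is an assumption.
    WAKE_UP = {
        "open the music interface": "music",
        "open the navigation interface": "navigation",
    }
    # Saved correspondence between service interface and template identifier.
    TEMPLATE_OF = {
        "music": "music_template",
        "navigation": "navigation_template",
    }

    def __init__(self):
        self.current_interface = None  # system interface: no service opened yet

    def wake_up(self, utterance: str) -> bool:
        """Open the service interface named by an interface wake-up instruction."""
        interface = self.WAKE_UP.get(utterance.strip().lower())
        if interface is None:
            return False  # not a wake-up instruction; interface unchanged
        self.current_interface = interface
        return True

    def current_template(self):
        """Look up the template for the current service interface, if any."""
        return self.TEMPLATE_OF.get(self.current_interface)
```

Until a wake-up instruction has been received, `current_template()` returns `None`, so no operation instruction can match.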
  • In step 23, group word division is performed on the received voice information.
  • Voice information segmentation is used: the received voice information is divided into group words, and the keywords are split and combined.
  • In step 24, it is determined whether the keywords obtained after group word division are in the keyword lexicon. If they are, step 25 is performed; if they are not, no operation is performed.
  • Optionally, in step 24, when a keyword obtained after group word division is not in the keyword lexicon, the keyword may be displayed to the user.
  • If the user confirms the keyword, the terminal receives a confirmation instruction and can continue with step 25.
  • If the user rejects the keyword, the terminal receives a negative instruction and performs no operation. This avoids the situation in which some keywords cannot be recognized because the keyword lexicon is incomplete. Further, after the user confirms a keyword, the keyword can be added to the keyword lexicon. Optionally, the user can give the confirmation or negative instruction by voice.
  • Optionally, it may first be determined whether the keywords obtained after group word division include the instruction keyword. If they do, the step of determining whether the keywords are in the keyword lexicon is performed; if they do not, it can be determined directly that the received voice information does not match the operation instruction template. The keywords are thus matched against the keyword lexicon only when the received voice information contains the instruction keyword, which improves processing efficiency.
  • In step 25, it is determined whether the keywords obtained after group word division match the keyword arrangement order. If they match, the received voice information is determined to match the operation instruction template and step 26 is performed; if they do not match, no operation is performed.
  • In step 26, the operation indicated by the voice information is performed.
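
Steps 23 to 25 can be sketched as follows. This is an illustration under several assumptions, not the patent's implementation: the voice information is taken to be already transcribed to text, group word division is approximated by a greedy longest-match split, the lexicon maps keywords to categories, and the `confirm` callback (standing in for displaying a keyword and waiting for the user) returns a category for a confirmed keyword or a falsy value for a rejected one. The function names and category labels are hypothetical.

```python
def segment(text, lexicon):
    """Greedy longest-match split of transcribed text into keywords.

    Stand-in for group word division; words not found in the lexicon
    are returned as single-word keywords.
    """
    words = text.lower().split()
    keywords, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest span first
            candidate = " ".join(words[i:j])
            if candidate in lexicon or j == i + 1:
                keywords.append(candidate)
                i = j
                break
    return keywords


def matches_template(text, lexicon, order, confirm=lambda kw: False):
    """Return True if the text matches the template (steps 23 to 25)."""
    keywords = segment(text, lexicon)  # step 23
    # Optional pre-check: give up early if no instruction keyword is present.
    if not any(lexicon.get(kw) == "instruction" for kw in keywords):
        return False
    categories = []
    for kw in keywords:  # step 24
        if kw not in lexicon:
            category = confirm(kw)  # show the unknown keyword; falsy = rejected
            if not category:
                return False
            lexicon[kw] = category  # confirmed keywords update the lexicon
        categories.append(lexicon[kw])
    return tuple(categories) == tuple(order)  # step 25
```

With a music lexicon of `{"play": "instruction", "song 1": "song"}` and the order `("instruction", "song")`, "Play Song 1" matches, while a bare "Song 1" or the reversed "Song 1 play" does not. Note that the sketch mutates the lexicon when a keyword is confirmed, mirroring the optional lexicon update described above.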
  • For example, the driver inputs the interface wake-up instruction “open the navigation interface” by voice, and after receiving the interface wake-up instruction, the in-vehicle device opens the navigation service interface. After the navigation service interface is opened, the driver can continue by inputting the voice information “Navigate to Tiananmen Square”; the in-vehicle device determines that the voice information matches the operation instruction template corresponding to the navigation service interface and performs the corresponding navigation operation.
  • Similarly, the driver inputs the interface wake-up instruction “Open the music interface” by voice, and after receiving the interface wake-up instruction, the in-vehicle device opens the music service interface. After the music service interface is opened, the driver can continue by inputting the voice information "Play Song 1"; the in-vehicle device determines that the voice information matches the operation instruction template corresponding to the music service interface and performs the corresponding music playing operation.
  • If the voice information received by the in-vehicle device does not conform to the format "play + song name", no operation is performed. This prevents a song name spoken by another voice in the small space inside the vehicle from being mistakenly recognized as a new play instruction and interrupting the music currently playing.
  • FIG. 3 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • The apparatus may be located in a terminal and includes a voice information receiving module 31, a determining module 32, and a voice information response module 33.
  • The voice information receiving module 31 is configured to receive voice information.
  • The determining module 32 is configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface, and to send the determination result to the voice information response module 33.
  • The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
  • FIG. 4 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention.
  • The apparatus may be located in a terminal and includes a voice information receiving module 31, a determining module 32, a voice information response module 33, and a wake-up module 34.
  • The operation instruction template in the embodiment of the present invention may include a keyword arrangement order and a keyword lexicon.
  • The voice information receiving module 31 is configured to receive voice information.
  • The determining module 32 may include a voice information analysis sub-module 321, a first determining sub-module 322, and a second determining sub-module 323.
  • The voice information analysis sub-module 321 is configured to perform group word division on the voice information.
  • The first determining sub-module 322 is configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module 323 to operate; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template.
  • Optionally, when the first determining sub-module 322 determines that a keyword obtained after group word division is not included in the keyword lexicon, the keyword may also be presented to the user.
  • In this case, the first determining sub-module 322 may further include a first judgment execution sub-module 3221, a display sub-module 3222, and a triggering module 3223.
  • A block diagram of the apparatus containing these parts is shown in FIG. 5.
  • The first judgment execution sub-module 3221 is configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module 323 to operate; and when they are not included in the keyword lexicon, to trigger the display sub-module 3222 to operate.
  • The display sub-module 3222 is configured to display, when a keyword obtained after group word division is not included in the keyword lexicon, the keyword that is not included in the keyword lexicon.
  • The triggering module 3223 is configured to trigger the second determining sub-module 323 to operate after a confirmation instruction is received, and to determine that the voice information does not match the operation instruction template after a negative instruction is received.
  • The first determining sub-module 322 may further include a second judgment execution sub-module 3224 and a third judgment execution sub-module 3225.
  • A block diagram of the apparatus containing these parts is shown in FIG. 5A.
  • The second judgment execution sub-module 3224 is configured to determine, according to the splitting and combining of the keywords obtained after group word division, whether those keywords include the instruction keyword; when they include the instruction keyword, to trigger the third judgment execution sub-module 3225 to operate; and when they do not include the instruction keyword, to determine that the voice information does not match the operation instruction template.
  • The third judgment execution sub-module 3225 is configured to determine whether the keywords obtained after group word division are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module 323 to operate; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template.
  • The second determining sub-module 323 is configured to determine whether the keywords obtained after group word division match the keyword arrangement order; when they match the keyword arrangement order, to determine that the voice information matches the operation instruction template; and when they do not match the keyword arrangement order, to determine that the voice information does not match the operation instruction template.
  • The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
  • The wake-up module 34 is configured to receive an interface wake-up instruction and to open the current service interface corresponding to the interface wake-up instruction.
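
The module structure of FIG. 4 could be wired together roughly as below. This is a hedged sketch rather than the apparatus itself: the class names are hypothetical, the analysis sub-module is reduced to a whitespace split of already-transcribed text (so every keyword in the demo is a single word), the lexicon is assumed to map keywords to categories, and the optional display and triggering sub-modules are omitted.

```python
class DeterminingModule:
    """Sketch of determining module 32; methods mirror sub-modules 321 to 323."""

    def __init__(self, lexicon, order, analyse=lambda text: text.lower().split()):
        self.lexicon = lexicon      # keyword -> category (assumed representation)
        self.order = tuple(order)   # keyword arrangement order as categories
        self.analyse = analyse      # voice information analysis sub-module 321

    def first_determination(self, keywords):   # first determining sub-module 322
        return all(kw in self.lexicon for kw in keywords)

    def second_determination(self, keywords):  # second determining sub-module 323
        return tuple(self.lexicon[kw] for kw in keywords) == self.order

    def matches(self, voice_information):
        keywords = self.analyse(voice_information)
        # 322 triggers 323 only when every keyword is in the lexicon.
        return (self.first_determination(keywords)
                and self.second_determination(keywords))


class SpeechRecognitionApparatus:
    """Wires together modules 31 to 34, with one determining module per interface."""

    def __init__(self, determiners, wake_up_phrases):
        self.determiners = determiners          # interface -> DeterminingModule
        self.wake_up_phrases = wake_up_phrases  # phrase -> interface (module 34)
        self.current_interface = None
        self.performed = []                     # log kept by response module 33

    def receive(self, voice_information):       # receiving module 31
        phrase = voice_information.lower()
        if phrase in self.wake_up_phrases:      # wake-up module 34
            self.current_interface = self.wake_up_phrases[phrase]
            return True
        determiner = self.determiners.get(self.current_interface)
        if determiner is None or not determiner.matches(voice_information):
            return False                        # response module 33: no operation
        self.performed.append((self.current_interface, phrase))
        return True
```

Separating the two determinations keeps the short-circuit described in the text: the arrangement order is checked only after the lexicon check has passed.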
  • FIG. 6 is a block diagram of an apparatus 600 for speech recognition, according to an exemplary embodiment.
  • Device 600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • Device 600 can include one or more of the following components: processing component 602, memory 604, power component 606, multimedia component 608, audio component 610, input/output (I/O) interface 612, sensor component 614, and communication component 616.
  • Processing component 602 typically controls the overall operation of device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 602 can include one or more processors 620 to execute instructions to perform all or part of the steps of the speech recognition method described above.
  • Processing component 602 can also include one or more modules to facilitate interaction between processing component 602 and other components.
  • For example, processing component 602 can include a multimedia module to facilitate interaction between multimedia component 608 and processing component 602.
  • Memory 604 is configured to store various types of data to support operation at device 600. Examples of such data include instructions for any application or method operating on device 600, contact data, phone book data, messages, pictures, videos, and the like.
  • Memory 604 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
  • Power component 606 provides power to various components of device 600.
  • Power component 606 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 600.
  • The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user.
  • The screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
  • The multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front or rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • The audio component 610 is configured to output and/or input audio signals.
  • For example, audio component 610 includes a microphone (MIC) that is configured to receive external audio signals when device 600 is in an operational mode, such as a call mode, a recording mode, or a voice recognition mode.
  • The received audio signal may be further stored in memory 604 or transmitted via communication component 616.
  • Audio component 610 also includes a speaker for outputting audio signals.
  • The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor component 614 includes one or more sensors for providing device 600 with status assessments of various aspects.
  • For example, sensor component 614 can detect the open/closed state of device 600 and the relative positioning of components, such as the display and keypad of device 600. Sensor component 614 can also detect a change in position of device 600 or of one of its components, the presence or absence of user contact with device 600, the orientation or acceleration/deceleration of device 600, and temperature changes of device 600.
  • Sensor component 614 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • Sensor component 614 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 616 is configured to facilitate wired or wireless communication between device 600 and other devices.
  • Device 600 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • Communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel.
  • Communication component 616 also includes a near field communication (NFC) module to facilitate short-range communication.
  • The NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • Device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above methods.
  • A non-transitory computer-readable storage medium comprising instructions is also provided, such as memory 604 comprising instructions executable by processor 620 of apparatus 600 to perform the above method.
  • The non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

Abstract

A method and device for speech recognition, in which different operation instruction templates are set for different service interfaces. Taking the operation instruction template corresponding to the current service interface as the reference, it is determined whether the received speech information matches that template, and the operation indicated by the speech information is executed only if the match succeeds. This prevents similar-sounding speech input from being taken as an operation instruction in the presence of multiple voices and interrupting the service currently provided, so that the content of the operation instruction in the speech information is recognized accurately.

Description

Speech recognition method and device
Technical field
The present invention relates to the field of voice technologies, and in particular to a speech recognition method and device.
Background
With the development of multimedia technology, the services offered by multimedia systems have expanded to include, for example, music, video, pictures, real-time road condition information, destination map navigation, and voice navigation. The widespread use of smart terminals gives these services broad room for development.
Whether a terminal has buttons or a touch screen, these services can be used only through manual operation, which is cumbersome and can be dangerous; for example, a driver who manually operates in-vehicle equipment while driving may cause an accident. The development of speech recognition technology offers a new direction for such operation. However, when speech recognition is used to access these services in a confined interior space such as a car, multiple sounds will be present at the same time, and accurately recognizing the content of the operation instruction in the voice information has become a problem in urgent need of a solution.
Summary of the invention
An embodiment of the present invention provides a speech recognition method capable of accurately recognizing the content of an operation instruction in voice information when multiple sounds are present.
An embodiment of the present invention further provides a speech recognition device capable of accurately recognizing the content of an operation instruction in voice information when multiple sounds are present.
The speech recognition method provided by the embodiment of the present invention includes:
receiving voice information;
determining whether the voice information matches the operation instruction template corresponding to the current service interface;
if the voice information matches the operation instruction template, performing the operation indicated by the voice information; if the voice information does not match the operation instruction template, performing no operation.
It can be seen that, in the embodiment of the present invention, different operation instruction templates are set for different service interfaces. Taking the operation instruction template corresponding to the current service interface as the reference, it is determined whether the received voice information matches that template, and the operation indicated by the voice information is performed only if the match succeeds. This avoids treating approximate voice input as an operation instruction when multiple sounds are present and thereby interrupting the service currently being provided, so that the content of the operation instruction in the voice information is recognized accurately.
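The gating behaviour described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the interface names, the English command phrases, and the prefix-based notion of a "template" are all hypothetical, and the input is assumed to be text that has already been recognized.

```python
# Hypothetical per-interface templates; names and phrases are illustrative only.
TEMPLATES = {
    "navigation": "navigate to",
    "music": "play",
}


def handle(text, current_interface):
    """Return the operation to execute, or None when no operation is performed."""
    prefix = TEMPLATES[current_interface] + " "
    cleaned = text.strip()
    # Only the template of the CURRENT service interface is consulted.
    if cleaned.lower().startswith(prefix):
        return "execute: " + cleaned
    return None
```

Under these assumptions, an utterance that would be a valid music command is ignored while the navigation interface is current, because it is never compared against the music template.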
As an optional implementation, the operation instruction template includes a keyword arrangement order and a keyword lexicon.
It can be seen that the operation instruction template in the embodiment of the present invention includes not only a keyword lexicon but also a keyword arrangement order, which raises the standard for matching the template and allows the content of the operation instruction in the voice information to be recognized more accurately.
As an optional implementation, determining whether the voice information matches the operation instruction template includes:
performing group word division on the voice information;
determining, according to the splitting and combination of the keywords obtained from the group word division, whether the keywords obtained from the group word division are contained in the keyword lexicon;
if the keywords obtained from the group word division are contained in the keyword lexicon, determining whether those keywords match the keyword arrangement order; if they match the keyword arrangement order, determining that the voice information matches the operation instruction template; if they do not match the keyword arrangement order, determining that the voice information does not match the operation instruction template;
if the keywords obtained from the group word division are not contained in the keyword lexicon, determining that the voice information does not match the operation instruction template.
It can be seen that the embodiment of the present invention uses a voice information segmentation technique to perform group word division on the received voice information, so that the voice information is recognized precisely.
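The two-stage check described above (lexicon first, then arrangement order) can be sketched as follows. This is an illustrative sketch only: the template layout, the category names, and the sample keywords are assumptions, and the group word division itself is taken as already done, so the function receives a list of keywords.

```python
# Hypothetical navigation template: a lexicon mapping each keyword to a
# category, plus the category order a valid command must follow.
NAV_TEMPLATE = {
    "lexicon": {"navigate to": "instruction", "Tiananmen": "place"},
    "order": ["instruction", "place"],
}


def matches_template(keywords, template):
    """Apply the lexicon check first, then the arrangement-order check."""
    lexicon = template["lexicon"]
    # Lexicon check: every keyword from the division must be known.
    if any(word not in lexicon for word in keywords):
        return False
    # Order check: the keyword categories must follow the template's order.
    return [lexicon[word] for word in keywords] == template["order"]
```

With this sketch, the same two keywords in the wrong order are rejected, which is the stricter behaviour that adding an arrangement order to the lexicon is meant to provide.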
As an optional implementation, the method further includes:
if a keyword obtained from the group word division is not contained in the keyword lexicon, displaying that keyword;
after a confirmation instruction is received, proceeding to the step of determining whether the keywords obtained from the group word division match the keyword arrangement order; after a negative instruction is received, determining that the voice information does not match the operation instruction template.
It can be seen that, in the embodiment of the present invention, when a keyword obtained from the group word division is not contained in the keyword lexicon, the keyword can be displayed, and if a confirmation instruction is received, the step of determining whether the keyword matches the keyword arrangement order is carried out. This avoids misjudging keywords obtained from the group word division when the keyword lexicon is incomplete.
As an optional implementation, the method further includes:
before determining whether the keywords obtained from the group word division are contained in the keyword lexicon, determining whether those keywords include an instruction keyword;
if the keywords include an instruction keyword, proceeding to the step of determining whether the keywords obtained from the group word division are contained in the keyword lexicon; if they do not include an instruction keyword, determining that the voice information does not match the operation instruction template.
It can be seen that, in the embodiment of the present invention, it is first determined whether the voice information includes an instruction keyword, and only if it does is it further determined whether the keywords in the voice information are contained in the keyword lexicon, which improves processing efficiency.
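The instruction-keyword pre-check can be sketched as a short-circuit placed in front of the lexicon lookup. The function name and the use of plain Python sets are assumptions made for illustration; the specification does not say how the lexicon is stored.

```python
def passes_precheck(keywords, instruction_keywords, lexicon):
    """Return True when the arrangement-order check should go ahead."""
    # Cheap test first: without an instruction keyword there is no command.
    if not any(word in instruction_keywords for word in keywords):
        return False
    # Only now consult the (potentially much larger) keyword lexicon.
    return all(word in lexicon for word in keywords)
```

The efficiency gain claimed in the text comes from the early return: ordinary conversation that contains no instruction keyword never reaches the lexicon lookup.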
An embodiment of the present invention provides a speech recognition device, including:
a voice information receiving module, configured to receive voice information;
a determining module, configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface;
a voice information response module, configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
As an optional implementation, the operation instruction template includes a keyword arrangement order and a keyword lexicon.
As an optional implementation, the determining module includes:
a voice information analysis sub-module, configured to perform group word division on the voice information; a first determining sub-module, configured to determine, according to the splitting and combination of the keywords obtained from the group word division, whether those keywords are contained in the keyword lexicon, to trigger a second determining sub-module to operate when they are contained in the keyword lexicon, and to determine that the voice information does not match the operation instruction template when they are not;
a second determining sub-module, configured to determine whether the keywords obtained from the group word division match the keyword arrangement order, to determine that the voice information matches the operation instruction template when they do, and to determine that the voice information does not match the operation instruction template when they do not.
As an optional implementation, the first determining sub-module includes:
a first determination execution sub-module, configured to determine, according to the splitting and combination of the keywords obtained from the group word division, whether those keywords are contained in the keyword lexicon, to trigger the second determining sub-module to operate when they are contained in the keyword lexicon, and to trigger a display sub-module to operate when they are not;
a display sub-module, configured to display a keyword obtained from the group word division when that keyword is not contained in the keyword lexicon;
a triggering module, configured to trigger the second determining sub-module to operate after a confirmation instruction is received, and to determine that the voice information does not match the operation instruction template after a negative instruction is received.
As an optional implementation, the first determining sub-module includes:
a second determination execution sub-module, configured to determine, according to the splitting and combination of the keywords obtained from the group word division, whether those keywords include an instruction keyword, to trigger a third determination execution sub-module to operate when they do, and to determine that the voice information does not match the operation instruction template when they do not;
a third determination execution sub-module, configured to determine whether the keywords obtained from the group word division are contained in the keyword lexicon, to trigger the second determining sub-module to operate when they are, and to determine that the voice information does not match the operation instruction template when they are not.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
FIG. 1 is a flowchart of a speech recognition method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a speech recognition method according to an embodiment of the present invention;
FIG. 2A is a schematic diagram of a system interface according to an embodiment of the present invention;
FIG. 3 is a block diagram of a speech recognition device according to an embodiment of the present invention;
FIG. 4 is a block diagram of a speech recognition device according to an embodiment of the present invention;
FIG. 5 is a block diagram of a speech recognition device according to an embodiment of the present invention;
FIG. 5A is a block diagram of a speech recognition device according to an embodiment of the present invention;
FIG. 6 is a block diagram of a device 600 for speech recognition according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.
FIG. 1 is a flowchart of a speech recognition method according to an embodiment of the present invention, which may be applied in a terminal.
In step 11, voice information is received.
In step 12, it is determined whether the voice information matches the operation instruction template corresponding to the current service interface. If it matches, step 13 is performed; otherwise, no operation is performed.
In step 13, the operation indicated by the voice information is performed.
The operation instruction template in the embodiment of the present invention may include a keyword arrangement order and a keyword lexicon. Different service interfaces correspond to different operation instruction templates; for example, the navigation service interface corresponds to one operation instruction template, and the music service interface corresponds to another.
Taking the navigation service as an example, the operation instruction template corresponding to the navigation service interface is shown in Table 1.
Table 1
Figure PCTCN2016084463-appb-000001
Taking the music service as an example, the operation instruction template corresponding to the music service interface is shown in Table 2.
Table 2
Figure PCTCN2016084463-appb-000002
In the keyword templates shown in Table 1 and Table 2 above, there is a class of instruction keywords, such as "navigate to" in the navigation operation instruction template and "play" in the music operation instruction template. As these examples show, instruction keywords are usually verbs.
FIG. 2 is a flowchart of a speech recognition method according to an embodiment of the present invention; the method may be applied in a terminal.
In step 21, voice information is received.
In step 22, the operation instruction template corresponding to the current service interface is determined.
As an optional implementation, when an end user wants to use a service, the user may speak an interface wake-up instruction at the system interface shown in FIG. 2A, which displays the services currently available to the user together in the form of service channels. For example, to use the music service the user says "open the music interface", and to use the navigation service the user says "open the navigation interface". After receiving the interface wake-up instruction, the terminal opens the corresponding service interface as the current service interface, and subsequent operations are performed on the basis of that interface. The terminal stores the correspondence between service interfaces and operation instruction templates, so the operation instruction template corresponding to the current service interface can be determined from the current service interface.
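Step 22 can be sketched as two lookups: a wake-up instruction selects the current service interface, and the stored correspondence selects that interface's template. The class name, dictionary names, and template identifiers below are hypothetical, and the wake-up phrases are English renderings of the examples in the text.

```python
# Hypothetical lookup tables; the template values are placeholder identifiers.
WAKE_COMMANDS = {
    "open the navigation interface": "navigation",
    "open the music interface": "music",
}
TEMPLATE_FOR_INTERFACE = {
    "navigation": "navigation_template",
    "music": "music_template",
}


class Terminal:
    def __init__(self):
        self.current_interface = None  # no service interface opened yet

    def on_wake_command(self, text):
        """Open the interface named by a wake-up instruction, if any."""
        interface = WAKE_COMMANDS.get(text.strip().lower())
        if interface is not None:
            self.current_interface = interface
        return interface

    def current_template(self):
        """Return the template stored for the current service interface."""
        if self.current_interface is None:
            return None
        return TEMPLATE_FOR_INTERFACE[self.current_interface]
```

In this sketch an utterance that is not a wake-up instruction leaves the current interface unchanged, so later commands keep being checked against the same template.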
In step 23, group word division is performed on the received voice information.
As an optional implementation, a voice information segmentation technique is used to perform group word division on the received voice information, yielding the splitting and combination of keywords.
In step 24, it is determined whether the keywords obtained from the group word division are in the keyword lexicon. If they are, step 25 is performed; if they are not, no operation is performed.
As an optional alternative implementation, in step 24, when a keyword obtained from the group word division is not in the keyword lexicon, that keyword may be displayed and the user offered the options of confirming or rejecting it. When the user confirms the keyword, the terminal receives a confirmation instruction and may proceed to step 25; when the user rejects the keyword, the terminal receives a negative instruction and performs no operation. This prevents some keywords from going unrecognized when the keyword lexicon is incomplete. Further, after the user confirms the keyword, it may be added to the keyword lexicon. Optionally, the user may give the confirmation or negative instruction by voice.
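The confirmation path for out-of-lexicon keywords can be sketched as below. The `confirm` callback is a hypothetical stand-in for displaying the keyword and waiting for the user's confirmation or negative instruction, and the lexicon is modelled as a Python set purely for illustration.

```python
def resolve_unknown_keywords(keywords, lexicon, confirm):
    """Return True if processing may continue to the order check (step 25)."""
    for word in keywords:
        if word in lexicon:
            continue
        # Stand-in for displaying the keyword and waiting for the user's answer.
        if not confirm(word):
            return False  # negative instruction: no operation is performed
        lexicon.add(word)  # optional update of the lexicon after confirmation
    return True
```

Adding the confirmed keyword to the set mirrors the optional lexicon update mentioned in the text; dropping that line gives the variant in which confirmation applies to the current utterance only.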
As another optional alternative implementation, before determining whether the keywords obtained from the group word division are in the keyword lexicon, it is first determined whether those keywords include an instruction keyword. Only when they do is the step of determining whether the keywords are in the keyword lexicon performed; if they do not, it can be determined directly that the received voice information does not match the operation instruction template. Because the keyword lexicon is consulted only after the received voice information is found to include an instruction keyword, processing efficiency is improved.
In step 25, it is determined whether the keywords obtained from the group word division match the keyword arrangement order. If they match, it is determined that the received voice information matches the operation instruction template and step 26 is performed; if they do not match, no operation is performed.
In step 26, the operation indicated by the voice information is performed.
Several specific application scenarios based on the method shown in FIG. 1 or FIG. 2 are given below, taking an in-vehicle device as the terminal.
When the driver wants to use the navigation service, the driver speaks the interface wake-up instruction "open the navigation interface", and the in-vehicle device opens the navigation service interface on receiving it. Once the navigation service interface is open, the driver can go on to say "navigate to Tiananmen"; the in-vehicle device determines that this voice information matches the operation instruction template corresponding to the navigation service interface and performs the corresponding navigation operation. While navigation is in progress, suppose the other passengers and the driver discuss tourist attractions and mention several place names. As long as the voice information received by the in-vehicle device does not have the form "navigate to [place name]", no operation is performed. This prevents a place name picked up from other speech in the confined space of the car from being mistaken for a new navigation instruction and interrupting the navigation service in progress.
When the driver wants to use the music service, the driver speaks the interface wake-up instruction "open the music interface", and the in-vehicle device opens the music service interface on receiving it. Once the music service interface is open, the driver can go on to say "play Song 1"; the in-vehicle device determines that this voice information matches the operation instruction template corresponding to the music service interface and performs the corresponding music playback operation. While music is playing, suppose the other passengers and the driver discuss current popular songs and mention several song names. As long as the voice information received by the in-vehicle device does not have the form "play [song name]", no operation is performed. This prevents a song name picked up from other speech in the confined space of the car from being mistaken for a new playback instruction and interrupting the music playback service in progress.
Examples of speech recognition devices according to embodiments of the present invention are given below; these devices can implement the speech recognition method described above. The function of each module or sub-module in these devices corresponds to the respective step in the method flow; detailed explanations have been given above and are not repeated here.
FIG. 3 is a block diagram of a speech recognition device according to an embodiment of the present invention. The device may be located in a terminal and includes a voice information receiving module 31, a determining module 32, and a voice information response module 33.
The voice information receiving module 31 is configured to receive voice information.
The determining module 32 is configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface, and to send the determination result to the voice information response module 33.
The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
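The three-module arrangement of FIG. 3 can be sketched as a small composition of callables. This is only one possible reading: the class and field names are hypothetical, and each module is reduced to a plain function so that the data flow from module 31 through module 32 to module 33 is visible.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechRecognitionDevice:
    receive: Callable[[], str]        # stands in for receiving module 31
    is_match: Callable[[str], bool]   # stands in for determining module 32
    execute: Callable[[str], None]    # operation run by response module 33

    def step(self) -> bool:
        """Process one utterance; return True if an operation was executed."""
        text = self.receive()
        if self.is_match(text):
            self.execute(text)
            return True
        return False
```

Keeping the determining logic behind a single callable means the optional sub-modules described for FIG. 4, FIG. 5 and FIG. 5A could be swapped in without changing the receiving or response side.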
FIG. 4 is a block diagram of a speech recognition device according to an embodiment of the present invention. The device may be located in a terminal and includes a voice information receiving module 31, a determining module 32, a voice information response module 33, and a wake-up module 34.
The operation instruction template in the embodiment of the present invention may include a keyword arrangement order and a keyword lexicon.
The voice information receiving module 31 is configured to receive voice information.
The determining module 32 may include a voice information analysis sub-module 321, a first determining sub-module 322, and a second determining sub-module 323.
The voice information analysis sub-module 321 is configured to perform group word division on the voice information.
The first determining sub-module 322 is configured to determine, according to the splitting and combination of the keywords obtained from the group word division, whether those keywords are contained in the keyword lexicon, to trigger the second determining sub-module 323 to operate when they are contained in the keyword lexicon, and to determine that the voice information does not match the operation instruction template when they are not.
作为一种可选的替换方式,为了避免关键词词库不全,在对判断模块322判断所述进行组词划分后得到的关键词未包含在所述关键词词库中时,还可以给用户提供显示确认的可选功能。在这种情况下,第一判断子模块322可以进一步包括:第一判断执行子模块3221、显示子模块3222和触发模块3223。包含这部分的装置框图如图5所示。As an optional alternative, in order to prevent the keyword lexicon from being incomplete, when the determining module 322 determines that the keyword obtained after the group word division is not included in the keyword vocabulary, the user may also be given to the user. Provides optional features to display confirmation. In this case, the first determining sub-module 322 may further include: a first determining execution sub-module 3221, a display sub-module 3222, and a triggering module 3223. The block diagram of the device containing this part is shown in Figure 5.
第一判断执行子模块3221,用于根据所述进行组词划分后得到的关键词的分拆和组合,判断进行组词划分后得到的关键词是否包含在所述关键词词库中,在所述进行组词划分后得到的关键词包含在所述关键词词库中时,触发第二判断子模块323执行操作,在所述进行组词划分后得到的关键词未包含在所述关键词词库中时,触发显示子模块3222执行操作。The first judgment execution sub-module 3221 is configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether the keyword obtained after the group word division is included in the keyword vocabulary, When the keyword obtained after the group word division is included in the keyword vocabulary, the second judgment sub-module 323 is triggered to perform an operation, and the keyword obtained after the group word division is not included in the key When in the word dictionary, the trigger display sub-module 3222 performs an operation.
显示子模块3222,用于在所述进行组词划分后得到的关键词未包含在所述关键词词库中时,显示未包含在所述关键词词库中的进行组词划分得到的关键词。The display sub-module 3222 is configured to display, when the keyword obtained after the group word division is not included in the keyword vocabulary, a key that is not included in the keyword lexicon for group word division word.
触发模块3223,用于在接收到确认指令之后,触发第二判断子模块323执行操作;在接到否定指令之后,确定所述语音信息与所述操作指令模板不相匹配。The triggering module 3223 is configured to: after receiving the confirmation instruction, trigger the second determining sub-module 323 to perform an operation; after receiving the negative instruction, determine that the voice information does not match the operation instruction template.
As another optional implementation, to improve processing efficiency, the first judgment sub-module 322 may further include a second judgment execution sub-module 3224 and a third judgment execution sub-module 3225. A block diagram of the apparatus including these sub-modules is shown in FIG. 5A.
The second judgment execution sub-module 3224 is configured to determine, based on splitting and combining the keywords obtained from the group word division, whether those keywords include an instruction keyword. When they include an instruction keyword, it triggers the third judgment execution sub-module 3225 to perform its operation; when they do not, it determines that the voice information does not match the operation instruction template.
The third judgment execution sub-module 3225 is configured to determine whether the keywords obtained from the group word division are included in the keyword lexicon. When they are included in the keyword lexicon, it triggers the second judgment sub-module 323 to perform its operation; when they are not, it determines that the voice information does not match the operation instruction template.
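The efficiency gain of this variant comes from rejecting speech that contains no instruction keyword before the full lexicon lookup. A minimal sketch, assuming (hypothetically) that the instruction keywords and the keyword lexicon are both plain sets and that the keywords have already been split and combined:

```python
from typing import Iterable, Set


def precheck_then_lexicon(
    keywords: Iterable[str],
    instruction_keywords: Set[str],
    lexicon: Set[str],
) -> bool:
    """Return True if processing should continue to the order check (323)."""
    kws = list(keywords)
    # Step of sub-module 3224: reject early if no instruction keyword is present.
    if not any(kw in instruction_keywords for kw in kws):
        return False
    # Step of sub-module 3225: every keyword must be in the keyword lexicon.
    return all(kw in lexicon for kw in kws)
```

Unlike the first variant, an unknown keyword here leads directly to a non-match, with no display or confirmation step.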
The second judgment sub-module 323 is configured to determine whether the keywords obtained from the group word division match the keyword arrangement order. When the keywords match the keyword arrangement order, it determines that the voice information matches the operation instruction template; when the keywords do not match the keyword arrangement order, it determines that the voice information does not match the operation instruction template.
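This passage does not specify how the keyword arrangement order is represented. One possible reading, offered only as an assumption, is a list of positions ("slots"), each holding the set of keywords allowed at that position. Under that assumption the order check could be sketched as:

```python
from typing import Iterable, List, Set


def matches_order(keywords: Iterable[str], slots: List[Set[str]]) -> bool:
    """Return True if the i-th keyword is allowed in the i-th slot.

    `slots` is a hypothetical representation of the keyword arrangement
    order; the source text does not define one.
    """
    kws = list(keywords)
    return len(kws) == len(slots) and all(
        kw in slot for kw, slot in zip(kws, slots)
    )
```

For example, with slots `[{"play", "pause"}, {"song", "video"}]`, the keywords "play", "song" match, while the same keywords in reverse order do not.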
The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
The wake-up module 34 is configured to receive an interface wake-up instruction and to open the current service interface corresponding to the interface wake-up instruction.
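How the modules fit together can be sketched end to end: a wake-up instruction selects the current service interface, the template of that interface is used for matching, and the operation is executed only on a match. Everything below is an illustrative assumption (class names, the `Template` fields, the slot-based order representation, the `execute` callback); speech-to-text and group word division are taken as already done.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass
class Template:
    """Hypothetical operation instruction template for one service interface."""
    lexicon: Set[str]        # keyword lexicon
    slots: List[Set[str]]    # assumed form of the keyword arrangement order


class Recognizer:
    def __init__(self, templates: Dict[str, Template]) -> None:
        self.templates = templates
        self.current: Optional[str] = None

    def wake(self, interface: str) -> None:
        """Wake-up module: open the service interface named by the instruction."""
        if interface not in self.templates:
            raise KeyError(interface)
        self.current = interface

    def handle(self, keywords: Iterable[str],
               execute: Callable[[List[str]], None]) -> bool:
        """Judgment and response modules: execute only on a template match."""
        if self.current is None:
            return False
        template = self.templates[self.current]
        kws = list(keywords)
        in_lexicon = all(kw in template.lexicon for kw in kws)
        in_order = len(kws) == len(template.slots) and all(
            kw in slot for kw, slot in zip(kws, template.slots)
        )
        if in_lexicon and in_order:
            execute(kws)
            return True
        return False  # no match: no operation is performed
```

Because only the template of the current service interface is consulted, speech that does not fit that interface's commands is ignored rather than executed.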
FIG. 6 is a block diagram of a device 600 for speech recognition according to an exemplary embodiment. For example, the device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 6, the device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 typically controls the overall operation of the device 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 602 may include one or more processors 620 to execute instructions so as to complete all or part of the steps of the speech recognition method described above. In addition, the processing component 602 may include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation of the device 600. Examples of such data include instructions for any application or method operating on the device 600, contact data, phone book data, messages, pictures, videos, and the like. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 606 supplies power to the various components of the device 600. The power component 606 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC) that is configured to receive external audio signals when the device 600 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, the audio component 610 also includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing the device 600 with status assessments of various aspects. For example, the sensor component 614 can detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the device 600. The sensor component 614 can also detect a change in position of the device 600 or of one of its components, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in the temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 604 including instructions, which are executable by the processor 620 of the device 600 to complete the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The specification is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed herein. The embodiments are to be regarded as exemplary only, and the true scope and spirit of the invention are indicated by the claims.

Claims (10)

  1. A speech recognition method, characterized in that the method comprises:
    receiving voice information;
    determining whether the voice information matches an operation instruction template corresponding to a current service interface;
    if the voice information matches the operation instruction template, performing the operation indicated by the voice information; and if the voice information does not match the operation instruction template, performing no operation.
  2. The method of claim 1, characterized in that the operation instruction template comprises a keyword arrangement order and a keyword lexicon.
  3. The method of claim 2, characterized in that determining whether the voice information matches the operation instruction template comprises:
    performing group word division on the voice information;
    determining, based on splitting and combining the keywords obtained from the group word division, whether the keywords obtained from the group word division are included in the keyword lexicon;
    if the keywords obtained from the group word division are included in the keyword lexicon, determining whether the keywords match the keyword arrangement order; if the keywords match the keyword arrangement order, determining that the voice information matches the operation instruction template; if the keywords do not match the keyword arrangement order, determining that the voice information does not match the operation instruction template;
    if the keywords obtained from the group word division are not included in the keyword lexicon, determining that the voice information does not match the operation instruction template.
  4. The method of claim 3, characterized in that the method further comprises:
    if the keywords obtained from the group word division are not included in the keyword lexicon, displaying the keywords obtained from the group word division that are not included in the keyword lexicon;
    after a confirmation instruction is received, continuing with the step of determining whether the keywords obtained from the group word division match the keyword arrangement order; after a negative instruction is received, determining that the voice information does not match the operation instruction template.
  5. The method of claim 3, characterized in that the method further comprises:
    before determining whether the keywords obtained from the group word division are included in the keyword lexicon, determining whether the keywords obtained from the group word division include an instruction keyword;
    if the keywords obtained from the group word division include an instruction keyword, continuing with the step of determining whether the keywords are included in the keyword lexicon; if the keywords do not include an instruction keyword, determining that the voice information does not match the operation instruction template.
  6. A speech recognition apparatus, characterized in that the apparatus comprises:
    a voice information receiving module, configured to receive voice information;
    a judgment module, configured to determine whether the voice information matches an operation instruction template corresponding to a current service interface;
    a voice information response module, configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.
  7. The apparatus of claim 6, characterized in that the operation instruction template comprises a keyword arrangement order and a keyword lexicon.
  8. The apparatus of claim 7, characterized in that the judgment module comprises:
    a voice information analysis sub-module, configured to perform group word division on the voice information;
    a first judgment sub-module, configured to determine, based on splitting and combining the keywords obtained from the group word division, whether the keywords are included in the keyword lexicon, to trigger a second judgment sub-module to perform its operation when the keywords are included in the keyword lexicon, and to determine that the voice information does not match the operation instruction template when the keywords are not included in the keyword lexicon;
    the second judgment sub-module, configured to determine whether the keywords obtained from the group word division match the keyword arrangement order, to determine that the voice information matches the operation instruction template when the keywords match the keyword arrangement order, and to determine that the voice information does not match the operation instruction template when the keywords do not match the keyword arrangement order.
  9. The apparatus of claim 8, characterized in that the first judgment sub-module comprises:
    a first judgment execution sub-module, configured to determine, based on splitting and combining the keywords obtained from the group word division, whether the keywords are included in the keyword lexicon, to trigger the second judgment sub-module to perform its operation when the keywords are included in the keyword lexicon, and to trigger a display sub-module to perform its operation when the keywords are not included in the keyword lexicon;
    the display sub-module, configured to display, when keywords obtained from the group word division are not included in the keyword lexicon, the keywords that are not included in the keyword lexicon;
    a trigger module, configured to trigger the second judgment sub-module to perform its operation after a confirmation instruction is received, and to determine that the voice information does not match the operation instruction template after a negative instruction is received.
  10. The apparatus of claim 8, characterized in that the first judgment sub-module comprises:
    a second judgment execution sub-module, configured to determine, based on splitting and combining the keywords obtained from the group word division, whether the keywords include an instruction keyword, to trigger a third judgment execution sub-module to perform its operation when the keywords include an instruction keyword, and to determine that the voice information does not match the operation instruction template when the keywords do not include an instruction keyword;
    the third judgment execution sub-module, configured to determine whether the keywords obtained from the group word division are included in the keyword lexicon, to trigger the second judgment sub-module to perform its operation when the keywords are included in the keyword lexicon, and to determine that the voice information does not match the operation instruction template when the keywords are not included in the keyword lexicon.
PCT/CN2016/084463 2016-06-02 2016-06-02 Speech recognition method and device WO2017206133A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/084463 WO2017206133A1 (en) 2016-06-02 2016-06-02 Speech recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/084463 WO2017206133A1 (en) 2016-06-02 2016-06-02 Speech recognition method and device

Publications (1)

Publication Number Publication Date
WO2017206133A1 true WO2017206133A1 (en) 2017-12-07

Family

ID=60478412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/084463 WO2017206133A1 (en) 2016-06-02 2016-06-02 Speech recognition method and device

Country Status (1)

Country Link
WO (1) WO2017206133A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001018990A1 (en) * 1999-09-02 2001-03-15 Sennheiser Electronic Gmbh & Co. Kg Personal information system, especially a personal guide system, for wirelessly transmitting voice information
CN101154379A (en) * 2006-09-27 2008-04-02 夏普株式会社 Method and device for locating keywords in voice and voice recognition system
CN101303692A (en) * 2008-06-19 2008-11-12 徐文和 All-purpose numeral semantic library for translation of mechanical language
CN103077714A (en) * 2013-01-29 2013-05-01 华为终端有限公司 Information identification method and apparatus
CN105161099A (en) * 2015-08-12 2015-12-16 恬家(上海)信息科技有限公司 Voice-controlled remote control device and realization method thereof
CN105426357A (en) * 2015-11-06 2016-03-23 武汉卡比特信息有限公司 Fast voice selection method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019679A (en) * 2017-12-13 2019-07-16 联易软件有限公司 Food and drug administration method and apparatus
CN111063344A (en) * 2018-10-17 2020-04-24 青岛海信移动通信技术股份有限公司 Voice recognition method, mobile terminal and server
CN111063344B (en) * 2018-10-17 2022-06-28 青岛海信移动通信技术股份有限公司 Voice recognition method, mobile terminal and server

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16903518

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16903518

Country of ref document: EP

Kind code of ref document: A1