WO2017206133A1

WO2017206133A1 - Speech recognition method and device

Info

Publication number: WO2017206133A1
Application number: PCT/CN2016/084463
Authority: WO
Inventors: 吴刚; 党君利; 柳义庆; 冯晓龙
Original assignee: 深圳市智物联网络有限公司
Priority date: 2016-06-02
Filing date: 2016-06-02
Publication date: 2017-12-07

Abstract

A method and device for speech recognition, wherein different operation instruction templates are set for different service interfaces. Taking the operation instruction template corresponding to the current service interface as a reference, it is determined whether received speech information matches that template, and the operation indicated by the speech information is executed only if the match succeeds. This prevents approximate speech input from being taken as an operation instruction when multiple voices are present and interrupting the service currently being provided, so that the content of the operation instruction in the speech information is recognized accurately.

Description

Speech recognition method and device

Technical field

The present invention relates to the field of voice technologies, and in particular, to a voice recognition method and apparatus.

Background art

With the development of multimedia technology, the service items offered by multimedia systems have expanded to include music, video, pictures, real-time road condition signals, destination map navigation, voice navigation, and so on. The widespread use of intelligent terminals provides broad room for the development of these service items.

Whether a terminal has buttons or a touch screen, manual operation is required to use the above service items, which is not only cumbersome but can also be dangerous. For example, it is dangerous for a driver to manually operate in-vehicle equipment while driving. The development of speech recognition technology has provided a new direction for such operations. However, when speech recognition is used to access the above service items in a small enclosed space, such as the interior of a car, multiple voices may be present at the same time. How to accurately recognize the content of the operation instruction in the voice information has therefore become an urgent problem to be solved.

Summary of the invention

Embodiments of the present invention provide a voice recognition method capable of accurately identifying an operation instruction content in voice information when multiple voices exist.

The embodiment of the invention further provides a speech recognition device capable of accurately recognizing the content of the operation instruction in the speech information when there are multiple sounds.

The voice recognition method provided by the embodiment of the present invention includes:

Receiving voice information;

Determining whether the voice information matches the operation instruction template corresponding to the current service interface;

If the voice information matches the operation instruction template, performing the operation indicated by the voice information; if the voice information does not match the operation instruction template, performing no operation.

It can be seen that, in the embodiment of the present invention, different operation instruction templates are set for different service interfaces, and the operation instruction template corresponding to the current service interface is used as the standard for determining whether the received voice information matches. The operation indicated by the voice information is performed only if the match succeeds. This avoids treating approximate voice input as an operation instruction when multiple voices are present, which would interrupt the service currently being provided, and allows the content of the operation instruction in the voice information to be identified accurately.

As an optional implementation manner, the operation instruction template includes: a keyword arrangement order and a keyword lexicon.

It can be seen that the operation instruction template in the embodiment of the present invention includes not only the keyword lexicon but also the keyword arrangement order, which tightens the standard for matching against the operation instruction template and allows the operation instruction content in the voice information to be identified more accurately.
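As a rough illustration of the two parts of a template, the sketch below stores a keyword lexicon and a keyword arrangement order side by side. The class name, the category labels, and the sample entries are all assumptions made for this example; the text does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationInstructionTemplate:
    """Hypothetical container for the template of one service interface."""

    # Keyword lexicon: maps each known keyword to an assumed category label.
    lexicon: dict
    # Keyword arrangement order: the category sequence a command must follow.
    order: tuple


# Illustrative entries only; they are not taken from the patent's tables.
NAVIGATION_TEMPLATE = OperationInstructionTemplate(
    lexicon={"navigate to": "instruction", "Tiananmen Square": "place"},
    order=("instruction", "place"),
)

print(NAVIGATION_TEMPLATE.order)
```

Keeping the order as a sequence of categories rather than of literal words is one possible design choice; it lets a single order describe every command of the form "instruction, then place name".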

As an optional implementation manner, determining whether the voice information matches the operation instruction template includes:

Performing group word division on the voice information;

Determining, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords are included in the keyword lexicon;

If the keywords obtained after the group word division are included in the keyword lexicon, determining whether they match the keyword arrangement order; if they match the keyword arrangement order, determining that the voice information matches the operation instruction template; if they do not match the keyword arrangement order, determining that the voice information does not match the operation instruction template;

If the keywords obtained after the group word division are not included in the keyword lexicon, determining that the voice information does not match the operation instruction template.
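The two-stage check described in the steps above might be sketched as follows. This is only a minimal illustration under stated assumptions: the lexicon is modeled as a mapping from keyword to a category label, the arrangement order as a sequence of categories, and the function name `matches_template` is hypothetical. Treating an empty keyword list as a non-match is also an assumption.

```python
def matches_template(keywords, lexicon, order):
    """Return True only if every keyword is known and the order fits."""
    # Stage 1: every keyword must appear in the keyword lexicon.
    if not keywords or any(k not in lexicon for k in keywords):
        return False
    # Stage 2: the keyword categories must follow the arrangement order.
    return tuple(lexicon[k] for k in keywords) == tuple(order)


# Hypothetical music template.
lexicon = {"play": "instruction", "Song 1": "song"}
order = ("instruction", "song")

print(matches_template(["play", "Song 1"], lexicon, order))  # True
print(matches_template(["Song 1", "play"], lexicon, order))  # False: wrong order
print(matches_template(["play", "Song 9"], lexicon, order))  # False: unknown keyword
```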

It can be seen that the voice information segmentation technology is adopted in the embodiment of the present invention, and the received voice information is divided into group words to realize the effect of accurately identifying the voice information.

As an optional implementation manner, the method further includes:

If a keyword obtained after the group word division is not included in the keyword lexicon, displaying the keyword that is not included in the keyword lexicon;

After receiving a confirmation instruction, continuing with the step of determining whether the keywords obtained after the group word division match the keyword arrangement order; after receiving a negative instruction, determining that the voice information does not match the operation instruction template.

It can be seen that, in the embodiment of the present invention, when a keyword obtained after the group word division is not included in the keyword lexicon, the keyword may be displayed, and if a confirmation instruction is received, the step of determining whether the keywords match the keyword arrangement order is still performed. This avoids wrongly rejecting keywords obtained after the group word division merely because the keyword lexicon is incomplete.

As an optional implementation manner, the method further includes:

Before determining whether the keywords obtained after the group word division are included in the keyword lexicon, determining whether the keywords obtained after the group word division include an instruction keyword;

If the keywords obtained after the group word division include an instruction keyword, performing the step of determining whether the keywords obtained after the group word division are included in the keyword lexicon; if the keywords obtained after the group word division do not include an instruction keyword, determining that the voice information does not match the operation instruction template.

It can be seen that, in the embodiment of the present invention, it is first determined whether the voice information includes an instruction keyword, and only if it does is it further determined whether the keywords in the voice information are included in the keyword lexicon, thereby improving processing efficiency.

An embodiment of the present invention provides a voice recognition apparatus, including:

a voice information receiving module, configured to receive voice information;

a determining module, configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface;

a voice information response module, configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.

As an optional implementation manner, the determining module includes:

a voice information analysis sub-module, configured to perform group word division on the voice information;

a first determining sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to perform its operation; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template;

a second determining sub-module, configured to determine whether the keywords obtained after the group word division match the keyword arrangement order; when they match the keyword arrangement order, to determine that the voice information matches the operation instruction template; and when they do not match the keyword arrangement order, to determine that the voice information does not match the operation instruction template.

As an optional implementation manner, the first determining submodule includes:

a first determining execution sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to perform its operation; and when they are not included in the keyword lexicon, to trigger the display sub-module to perform its operation;

a display sub-module, configured to display, when a keyword obtained after the group word division is not included in the keyword lexicon, the keyword that is not included in the keyword lexicon;

a triggering module, configured to trigger the second determining sub-module to perform its operation after receiving a confirmation instruction, and to determine that the voice information does not match the operation instruction template after receiving a negative instruction.

As an optional implementation manner, the first determining submodule includes:

a second determining execution sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords include an instruction keyword; when they include an instruction keyword, to trigger the third determining execution sub-module to perform its operation; and when they do not include an instruction keyword, to determine that the voice information does not match the operation instruction template;

a third determining execution sub-module, configured to determine whether the keywords obtained after the group word division are included in the keyword lexicon; when they are included in the keyword lexicon, to trigger the second determining sub-module to perform its operation; and when they are not included in the keyword lexicon, to determine that the voice information does not match the operation instruction template.

DRAWINGS

The accompanying drawings, which are incorporated in and constitute in the claims

FIG. 1 is a flowchart of a method for voice recognition according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for voice recognition according to an embodiment of the present invention;

FIG. 2A is a schematic diagram of a system interface in an embodiment of the present invention;

FIG. 3 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention;

FIG. 4 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention;

FIG. 5 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention;

FIG. 5A is a block diagram of a voice recognition apparatus according to an embodiment of the present invention;

FIG. 6 is a block diagram of an apparatus 600 for speech recognition, according to an exemplary embodiment.

Detailed description

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. The following description refers to the same or similar elements in the different figures unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Instead, they are merely examples of devices and methods consistent with aspects of the invention as detailed in the appended claims.

FIG. 1 is a flowchart of a method for voice recognition according to an embodiment of the present invention, which may be applied to a terminal.

In step 11, voice information is received.

In step 12, it is determined whether the voice information matches the operation instruction template corresponding to the current service interface; if so, step 13 is performed; otherwise, no operation is performed.

In step 13, the operation indicated by the voice information is performed.
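Steps 11 to 13 can be pictured as a small dispatch routine. The sketch below is illustrative only: it assumes the speech has already been converted to text, reduces each template to a simple prefix check, and uses hypothetical names (`handle_voice`, `templates`, `record`) that do not come from the text.

```python
def handle_voice(voice_text, current_interface, templates, execute):
    """Steps 11 to 13: receive input, test it, and act only on a match."""
    matches = templates[current_interface]       # template of the current interface
    if matches(voice_text):                      # step 12
        execute(current_interface, voice_text)   # step 13
        return True
    return False                                 # no match: no operation


# Hypothetical templates, reduced to prefix checks purely for illustration.
templates = {
    "navigation": lambda text: text.lower().startswith("navigate to "),
    "music": lambda text: text.lower().startswith("play "),
}

performed = []


def record(interface, text):
    """Stand-in for actually carrying out the operation."""
    performed.append((interface, text))


handle_voice("Navigate to Tiananmen Square", "navigation", templates, record)
handle_voice("Play Song 1", "navigation", templates, record)  # wrong interface: ignored
print(performed)
```

Because only the template of the current service interface is consulted, a phrase that would be a valid music command has no effect while the navigation interface is open.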

The operation instruction template in the embodiment of the present invention may include: a keyword arrangement order and a keyword vocabulary. Different service interfaces correspond to different operation instruction templates. For example, the navigation service interface corresponds to one operation instruction template, and the music service interface corresponds to another operation instruction template.

Take the navigation service as an example. The operation instruction template corresponding to the navigation service interface is shown in Table 1.

Table 1

Taking the music service as an example, the operation instruction template corresponding to the music service interface is shown in Table 2.

Table 2

The templates shown in Tables 1 and 2 above contain a type of keyword called an instruction keyword, for example "navigate to" in the navigation operation instruction template and "play" in the music operation instruction template. It can be seen that the instruction keyword is usually a verb.

FIG. 2 is a flowchart of a method for voice recognition according to an embodiment of the present invention, and the method may be applied to a terminal.

In step 21, voice information is received.

In step 22, an operation instruction template corresponding to the current service interface is determined.

As an optional implementation manner, when the terminal user wants to use a service, a voice interface wake-up instruction may be input on the system interface shown in FIG. 2A, where the system interface displays the services currently available to the user in the form of service channels. For example, to use the music service, the user says "open the music interface", and to use the navigation service, the user says "open the navigation interface". After receiving the interface wake-up instruction, the terminal opens the current service interface corresponding to that instruction, and subsequent operations are performed on the basis of the opened current service interface. The correspondence between service interfaces and operation instruction templates is saved in the terminal, so the operation instruction template corresponding to the current service interface can be determined from the current service interface.
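The wake-up step and the saved correspondence between interfaces and templates could look roughly like the following. The phrase table, the template identifiers, and the `Terminal` class are assumptions introduced for this sketch, and the input is assumed to be already transcribed text.

```python
# Hypothetical wake-up phrases and template identifiers.
WAKE_UP_COMMANDS = {
    "open the music interface": "music",
    "open the navigation interface": "navigation",
}
TEMPLATE_FOR_INTERFACE = {
    "music": "music_template",
    "navigation": "navigation_template",
}


class Terminal:
    def __init__(self):
        self.current_interface = None  # no service interface opened yet

    def on_wake_up(self, phrase):
        """Open the interface named by a wake-up phrase, if it is recognized."""
        interface = WAKE_UP_COMMANDS.get(phrase.strip().lower())
        if interface is not None:
            self.current_interface = interface
        return interface

    def current_template(self):
        """Look up the template saved for the current service interface."""
        return TEMPLATE_FOR_INTERFACE.get(self.current_interface)


terminal = Terminal()
terminal.on_wake_up("Open the music interface")
print(terminal.current_template())
```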

In step 23, group word division is performed on the received voice information.

As an optional implementation manner, voice information segmentation technology is used to perform group word division on the received voice information and to split and combine the resulting keywords.
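The text does not say which segmentation technique is used. As a stand-in, the sketch below applies greedy longest-match segmentation to already transcribed text: it splits the text into words and combines adjacent words into the longest phrase found in a phrase list. The function name `segment`, the `max_words` limit, and the sample phrases are assumptions.

```python
def segment(text, phrases, max_words=4):
    """Greedy longest-match segmentation over whitespace-separated words."""
    words = text.lower().split()
    known = {phrase.lower() for phrase in phrases}
    keywords, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first, then shrink it.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # A single word is always accepted so the loop makes progress.
            if candidate in known or n == 1:
                keywords.append(candidate)
                i += n
                break
    return keywords


print(segment("Navigate to Tiananmen Square", ["navigate to", "tiananmen square"]))
# ['navigate to', 'tiananmen square']
```

Words that belong to no known phrase come out as single-word keywords, so a later lexicon check can still reject them.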

In step 24, it is determined whether the keywords obtained after the group word division are in the keyword lexicon. When they are in the keyword lexicon, step 25 is performed; when they are not in the keyword lexicon, no operation is performed.

As an optional alternative embodiment, in step 24, when a keyword obtained after the group word division is not in the keyword lexicon, the keyword not included in the keyword lexicon may be displayed and the user may be offered a choice to confirm or deny it. If the user confirms the keyword, the terminal receives a confirmation instruction and can continue with step 25. If the user denies the keyword, the terminal receives a negative instruction and performs no operation. This avoids the situation in which some keywords cannot be recognized because the keyword lexicon is incomplete. Further, after the user confirms the keyword, the keyword can be added to the keyword lexicon. Optionally, the user can give the confirmation or negative instruction by voice.
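One way to picture this confirm-or-deny variant of step 24 is with a callback that stands in for displaying the keyword and collecting the user's choice. The function name `check_lexicon`, the `confirm` callback, and the use of a plain set as the lexicon are assumptions for illustration.

```python
def check_lexicon(keywords, lexicon, confirm):
    """Return True if step 25 should run, False if no operation is performed."""
    for keyword in keywords:
        if keyword in lexicon:
            continue
        # Unknown keyword: show it and wait for a confirm or deny choice.
        if not confirm(keyword):
            return False          # negative instruction: stop here
        lexicon.add(keyword)      # confirmed: add it to the lexicon
    return True


lexicon = {"play", "Song 1"}

# The user confirms "Song 2", so it is accepted and remembered.
accepted = check_lexicon(["play", "Song 2"], lexicon, confirm=lambda kw: True)
print(accepted, "Song 2" in lexicon)

# The user denies "Song 3", so nothing is executed and nothing is learned.
rejected = check_lexicon(["play", "Song 3"], lexicon, confirm=lambda kw: False)
print(rejected, "Song 3" in lexicon)
```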

As another optional alternative embodiment, before determining whether the keywords obtained after the group word division are in the keyword lexicon, it is first determined whether those keywords include an instruction keyword. Only when they include an instruction keyword is the step of determining whether they are in the keyword lexicon performed. If they do not include an instruction keyword, it can be determined directly that the received voice information does not match the operation instruction template. Matching against the keyword lexicon is therefore performed only when the received voice information includes an instruction keyword, which improves processing efficiency.
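The ordering of the two checks can be sketched as an early exit. In this hypothetical example the instruction keywords form a small set, the return values are plain labels chosen for readability, and `precheck` is not a name from the text.

```python
def precheck(keywords, instruction_keywords, lexicon):
    """Run the instruction-keyword test before the lexicon test."""
    # Gate: without an instruction keyword the input cannot be a command.
    if not any(k in instruction_keywords for k in keywords):
        return "no match"
    # Only now look every keyword up in the keyword lexicon.
    if all(k in lexicon for k in keywords):
        return "check order"      # continue with step 25
    return "no match"


instruction_keywords = {"navigate to"}
lexicon = {"navigate to", "tiananmen square"}

print(precheck(["tiananmen square"], instruction_keywords, lexicon))
print(precheck(["navigate to", "tiananmen square"], instruction_keywords, lexicon))
```

A place name mentioned in conversation fails at the first test, so the full lexicon lookup is skipped for most background speech.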

In step 25, it is determined whether the keywords obtained after the group word division match the keyword arrangement order. When they match the keyword arrangement order, the received voice information is determined to match the operation instruction template and step 26 is performed. When they do not match the keyword arrangement order, no operation is performed.

In step 26, the operation indicated by the voice information is performed.

According to the method shown in FIG. 1 or FIG. 2, several specific application scenarios are given below. Take the terminal as an in-vehicle device as an example.

When the driver wants to use the navigation service, the driver inputs by voice the interface wake-up instruction "open the navigation interface", and after receiving the interface wake-up instruction, the in-vehicle device opens the navigation service interface. After the navigation service interface is opened, the driver can go on to input the voice information "Navigate to Tiananmen Square"; the in-vehicle device determines that the voice information matches the operation instruction template corresponding to the navigation service interface and performs the corresponding navigation operation. While the navigation service is being provided, suppose that the passengers and the driver talk about tourist attractions and mention several place names. In this case, as long as the voice information received by the in-vehicle device does not conform to the format "navigate to [place name]", no operation is performed. This prevents a place name spoken by someone else in the small space inside the vehicle from being mistaken for a new navigation instruction and interrupting the navigation service currently in progress.

When the driver wants to use the music service, the driver inputs by voice the interface wake-up instruction "open the music interface", and after receiving the interface wake-up instruction, the in-vehicle device opens the music service interface. After the music service interface is opened, the driver can go on to input the voice information "Play Song 1"; the in-vehicle device determines that the voice information matches the operation instruction template corresponding to the music service interface and performs the corresponding music playing operation. While music is being played, suppose that the passengers and the driver talk about currently popular songs and mention several song names. In this case, as long as the voice information received by the in-vehicle device does not conform to the format "play [song name]", no operation is performed. This prevents a song name spoken by someone else in the small space inside the vehicle from being mistaken for a new play instruction and interrupting the music playing service currently in progress.

An example of a speech recognition apparatus in an embodiment of the present invention, which can implement the speech recognition method described above, is given below. The individual modules and sub-modules of the apparatus correspond to steps in the method flow. The relevant detailed explanations have been given above and are not repeated below.

FIG. 3 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention. The apparatus may be located in a terminal, and includes: a voice information receiving module 31, a determining module 32, and a voice information response module 33.

The voice information receiving module 31 is configured to receive voice information.

The determining module 32 is configured to determine whether the operation instruction template corresponding to the current service interface matches the voice information, and send the determination result to the voice information response module 33.

The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.

FIG. 4 is a block diagram of a voice recognition apparatus according to an embodiment of the present invention. The apparatus may be located in a terminal, and includes: a voice information receiving module 31, a determining module 32, a voice information response module 33, and a wakeup module 34.

The operation instruction template in the embodiment of the present invention may include: a keyword arrangement order and a keyword lexicon.

The determining module 32 can include a voice information analyzing sub-module 321, a first determining sub-module 322, and a second determining sub-module 323.

The voice information analysis sub-module 321 is configured to perform group word division on the voice information.

The first determining sub-module 322 is configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords are included in the keyword lexicon. When they are included in the keyword lexicon, it triggers the second determining sub-module 323 to perform its operation. When they are not included in the keyword lexicon, it determines that the voice information does not match the operation instruction template.

As an optional alternative, to guard against an incomplete keyword lexicon, when the first determining sub-module 322 determines that a keyword obtained after the group word division is not included in the keyword lexicon, the user may also be offered an optional display-and-confirm function. In this case, the first determining sub-module 322 may further include: a first determining execution sub-module 3221, a display sub-module 3222, and a triggering module 3223. A block diagram of the device containing this part is shown in FIG. 5.

The first determining execution sub-module 3221 is configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords are included in the keyword lexicon. When they are included in the keyword lexicon, it triggers the second determining sub-module 323 to perform its operation. When they are not included in the keyword lexicon, it triggers the display sub-module 3222 to perform its operation.

The display sub-module 3222 is configured to display, when a keyword obtained after the group word division is not included in the keyword lexicon, the keyword that is not included in the keyword lexicon.

The triggering module 3223 is configured to trigger the second determining sub-module 323 to perform its operation after receiving a confirmation instruction, and to determine that the voice information does not match the operation instruction template after receiving a negative instruction.

As another optional implementation manner, in order to improve processing efficiency, the first determining submodule 322 may further include: a second determining executing submodule 3224 and a third determining executing submodule 3225. The block diagram of the device containing this part is shown in Figure 5A.

The second determining execution sub-module 3224 is configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether those keywords include an instruction keyword. When they include an instruction keyword, it triggers the third determining execution sub-module 3225 to perform its operation. When they do not include an instruction keyword, it determines that the voice information does not match the operation instruction template.

The third determining execution sub-module 3225 is configured to determine whether the keywords obtained after the group word division are included in the keyword lexicon. When they are included in the keyword lexicon, it triggers the second determining sub-module 323 to perform its operation. When they are not included in the keyword lexicon, it determines that the voice information does not match the operation instruction template.

The second determining sub-module 323 is configured to determine whether the keywords obtained after the group word division match the keyword arrangement order. When they match the keyword arrangement order, it determines that the voice information matches the operation instruction template. When they do not match the keyword arrangement order, it determines that the voice information does not match the operation instruction template.

The voice information response module 33 is configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and to perform no operation when the voice information does not match the operation instruction template.

The wake-up module 34 is configured to receive an interface wake-up instruction, and open the current service interface corresponding to the interface wake-up instruction.

FIG. 6 is a block diagram of an apparatus 600 for speech recognition, according to an exemplary embodiment. For example, device 600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.

Referring to FIG. 6, apparatus 600 can include one or more of the following components: processing component 602, memory 604, power component 606, multimedia component 608, audio component 610, input/output (I/O) interface 612, sensor component 614, and a communication component 616.

Processing component 602 typically controls the overall operation of device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing component 602 can include one or more processors 620 to execute instructions to perform all or part of the steps of the speech recognition method described above. Moreover, processing component 602 can include one or more modules to facilitate interaction between component 602 and other components. For example, processing component 602 can include a multimedia module to facilitate interaction between multimedia component 608 and processing component 602.

Memory 604 is configured to store various types of data to support operation at device 600. Examples of such data include instructions for any application or method operating on device 600, contact data, phone book data, messages, pictures, videos, and the like. The memory 604 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.

Power component 606 provides power to various components of device 600. Power component 606 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 600.

The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.

The audio component 610 is configured to output and/or input an audio signal. For example, audio component 610 includes a microphone (MIC) that is configured to receive an external audio signal when device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in memory 604 or transmitted via communication component 616. In some embodiments, audio component 610 also includes a speaker for outputting an audio signal.

The I/O interface 612 provides an interface between the processing component 602 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.

Sensor component 614 includes one or more sensors for providing device 600 with status assessments of various aspects. For example, sensor component 614 can detect an open/closed state of device 600 and the relative positioning of components, such as the display and keypad of device 600. Sensor component 614 can also detect a change in position of device 600 or of one component of device 600, the presence or absence of user contact with device 600, the orientation or acceleration/deceleration of device 600, and a temperature change of device 600. Sensor component 614 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor component 614 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

Communication component 616 is configured to facilitate wired or wireless communication between device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as Wi-Fi, 2G or 3G, or a combination thereof. In an exemplary embodiment, communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 also includes a near field communication (NFC) module to facilitate short range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, device 600 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above methods.

In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium comprising instructions, such as a memory 604 comprising instructions executable by processor 620 of device 600 to perform the above method. For example, the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

Other embodiments of the invention will be apparent to those skilled in the art. The description is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common general knowledge or customary technical means in the art that are not disclosed herein. The examples are to be considered as illustrative only, and the true scope and spirit of the invention are indicated by the claims.

Claims

A speech recognition method, characterized in that the method comprises:

Receiving voice information;

Determining whether the voice information matches the operation instruction template corresponding to the current service interface;

If the voice information matches the operation instruction template, performing the operation indicated by the voice information; if the voice information does not match the operation instruction template, not performing the operation.
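The gating behaviour of claim 1 can be illustrated with a minimal Python sketch. It assumes the voice information has already been converted to text by a recognizer, and it simplifies each operation instruction template to a table of accepted phrases. The names `TEMPLATES` and `handle_voice`, the interface names, and the operation names are hypothetical and do not appear in the application.

```python
# Hypothetical per-interface templates: each service interface accepts only
# its own command phrases (a simplification made for illustration).
TEMPLATES = {
    "music": {"next song": "skip_track", "pause": "pause_playback"},
    "navigation": {"go home": "route_home", "zoom in": "zoom_map"},
}


def handle_voice(current_interface, utterance):
    """Return the operation to perform, or None when the utterance is ignored."""
    # Only the template of the interface currently in use is consulted.
    template = TEMPLATES.get(current_interface, {})
    # Normalize case and whitespace before the lookup.
    phrase = " ".join(utterance.lower().split())
    # A miss returns None, meaning no operation is performed.
    return template.get(phrase)
```

In this sketch, a phrase that is valid on the music interface is ignored while the navigation interface is current, which is the behaviour that prevents unrelated speech from interrupting the service in use.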
The method of claim 1, wherein the operation instruction template comprises: a keyword arrangement order and a keyword vocabulary.
The method of claim 2, wherein the determining whether the voice information matches the operation instruction template comprises:

Performing group word division on the voice information;

Determining, according to the splitting and combining of the keywords obtained after the group word division, whether the keyword obtained after the group word division is included in the keyword vocabulary;

If the keyword obtained after the group word division is included in the keyword vocabulary, determining whether the keyword obtained after the group word division matches the keyword arrangement order; if the keyword obtained after the group word division matches the keyword arrangement order, determining that the voice information matches the operation instruction template; if the keyword obtained after the group word division does not match the keyword arrangement order, determining that the voice information does not match the operation instruction template;

If the keyword obtained after the group word division is not included in the keyword vocabulary, it is determined that the voice information does not match the operation instruction template.
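The two-stage check of claim 3 (vocabulary membership first, arrangement order second) might be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: group word division is replaced by whitespace splitting, the splitting and combining of keywords is omitted, and the arrangement order is modelled as a required sequence of keyword categories. `Template`, `divide`, `matches`, and `MUSIC` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    vocabulary: dict   # keyword -> category (hypothetical representation)
    order: tuple       # required sequence of categories


def divide(utterance):
    """Stand-in for group word division: lowercase and split on whitespace."""
    return utterance.lower().split()


def matches(utterance, template):
    keywords = divide(utterance)
    # Stage 1: every keyword must be in the keyword vocabulary.
    if not keywords or any(k not in template.vocabulary for k in keywords):
        return False
    # Stage 2: the keywords must follow the keyword arrangement order.
    categories = tuple(template.vocabulary[k] for k in keywords)
    return categories == template.order


MUSIC = Template(
    vocabulary={"play": "action", "pause": "action",
                "song": "object", "album": "object"},
    order=("action", "object"),
)
```

With this template, "play song" matches, "song play" fails the order check, and "play movie" fails the vocabulary check.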
The method of claim 3, wherein the method further comprises:

If the keyword obtained after the group word division is not included in the keyword vocabulary, displaying the keyword that is not included in the keyword vocabulary;

After receiving a confirmation instruction, proceeding to perform the step of determining whether the keyword obtained after the group word division matches the keyword arrangement order; and after receiving a negative instruction, determining that the voice information does not match the operation instruction template.
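The confirmation branch of claim 4 can be sketched in Python as below. The `confirm` callback stands in for displaying the out-of-vocabulary keywords and waiting for the user's confirmation or negative instruction. How confirmed keywords are treated by the order check is not specified in the claim; this sketch assumes they are skipped. All names and the sample vocabulary are hypothetical.

```python
# Hypothetical template for a navigation interface.
VOCABULARY = {"navigate": "action", "to": "link",
              "home": "place", "office": "place"}
ORDER = ("action", "link", "place")


def order_matches(keywords):
    # Assumption: confirmed out-of-vocabulary keywords are skipped here.
    categories = tuple(VOCABULARY[k] for k in keywords if k in VOCABULARY)
    return categories == ORDER


def match_with_confirmation(utterance, confirm):
    """confirm(unknown) shows the unknown keywords and returns True or False."""
    keywords = utterance.lower().split()
    unknown = [k for k in keywords if k not in VOCABULARY]
    # A negative instruction ends the match immediately.
    if unknown and not confirm(unknown):
        return False
    # After confirmation (or with no unknown keywords), check the order.
    return order_matches(keywords)
```

A caller would pass a function that presents the list to the user, for example one that shows a dialog and returns the user's choice.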
The method of claim 3, wherein the method further comprises:

Before determining whether the keyword obtained after the group word division is included in the keyword vocabulary, determining whether the keyword obtained after the group word division includes an instruction keyword;

If the keyword obtained after the group word division includes the instruction keyword, proceeding to perform the step of determining whether the keyword obtained after the group word division is included in the keyword vocabulary; if the keyword obtained after the group word division does not include the instruction keyword, determining that the voice information does not match the operation instruction template.
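The instruction-keyword pre-check of claim 5 acts as an early filter ahead of the vocabulary and order checks. A minimal sketch follows, returning the stage at which an utterance is rejected. The keyword sets and the order rule (instruction keyword first, no further instruction keywords after it) are assumptions chosen for illustration, as is the function name `classify`.

```python
# Hypothetical keyword sets for a media interface.
INSTRUCTION_KEYWORDS = frozenset({"play", "pause", "stop"})
VOCABULARY = INSTRUCTION_KEYWORDS | {"music", "video"}


def classify(utterance):
    """Return 'match' or the name of the first check that fails."""
    keywords = utterance.lower().split()
    # Pre-check: reject early when no instruction keyword is present.
    if not any(k in INSTRUCTION_KEYWORDS for k in keywords):
        return "no instruction keyword"
    # Vocabulary check: every keyword must be known.
    if any(k not in VOCABULARY for k in keywords):
        return "unknown keyword"
    # Order check (assumed rule): instruction keyword first, objects after.
    first, rest = keywords[0], keywords[1:]
    if first not in INSTRUCTION_KEYWORDS or any(k in INSTRUCTION_KEYWORDS for k in rest):
        return "wrong order"
    return "match"
```

Ordinary conversation that contains no instruction keyword is discarded at the first step, so the later checks run only on utterances that could plausibly be commands.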
A speech recognition apparatus, characterized in that the apparatus comprises:

a voice information receiving module, configured to receive voice information;

a determining module, configured to determine whether the voice information matches the operation instruction template corresponding to the current service interface;

a voice information response module, configured to perform the operation indicated by the voice information when the voice information matches the operation instruction template, and not to perform the operation when the voice information does not match the operation instruction template.
The apparatus according to claim 6, wherein the operation instruction template comprises: a keyword arrangement order and a keyword vocabulary.
The apparatus according to claim 7, wherein the determining module comprises:

a voice information analysis sub-module, configured to perform group word division on the voice information;

a first determining sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether the keyword obtained after the group word division is included in the keyword vocabulary; when the keyword obtained after the group word division is included in the keyword vocabulary, trigger the second determining sub-module to perform an operation; and when the keyword obtained after the group word division is not included in the keyword vocabulary, determine that the voice information does not match the operation instruction template;

a second determining sub-module, configured to determine whether the keyword obtained after the group word division matches the keyword arrangement order; when the keyword obtained after the group word division matches the keyword arrangement order, determine that the voice information matches the operation instruction template; and when the keyword obtained after the group word division does not match the keyword arrangement order, determine that the voice information does not match the operation instruction template.
The apparatus according to claim 8, wherein the first determining sub-module comprises:

a first judgment execution sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether the keyword obtained after the group word division is included in the keyword vocabulary; when the keyword obtained after the group word division is included in the keyword vocabulary, trigger the second determining sub-module to perform an operation; and when the keyword obtained after the group word division is not included in the keyword vocabulary, trigger the display sub-module to perform an operation;

a display sub-module, configured to display the keyword that is not included in the keyword vocabulary when the keyword obtained after the group word division is not included in the keyword vocabulary;

a triggering module, configured to: after receiving a confirmation instruction, trigger the second determining sub-module to perform an operation; and after receiving a negative instruction, determine that the voice information does not match the operation instruction template.
The apparatus according to claim 8, wherein the first determining sub-module comprises:

a second judgment execution sub-module, configured to determine, according to the splitting and combining of the keywords obtained after the group word division, whether the keyword obtained after the group word division includes an instruction keyword; when the keyword obtained after the group word division includes the instruction keyword, trigger the third judgment execution sub-module to perform an operation; and when the keyword obtained after the group word division does not include the instruction keyword, determine that the voice information does not match the operation instruction template;

a third judgment execution sub-module, configured to determine whether the keyword obtained after the group word division is included in the keyword vocabulary; when the keyword obtained after the group word division is included in the keyword vocabulary, trigger the second determining sub-module to perform an operation; and when the keyword obtained after the group word division is not included in the keyword vocabulary, determine that the voice information does not match the operation instruction template.