Method for standardizing voice recognition instruction and operation instruction of voice recognition equipment
Technical Field
The application belongs to the technical field of electronics and computers, and relates to a method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment.
Background
With the arrival of the "Internet Plus" era and the rapid development of the Internet of Things (IoT) industry, demand for IoT devices is growing across many industries. Small IoT devices that are low in power consumption, compact, and easy to use are especially attractive to users and popular in the market. Among IoT devices for fields such as smart homes and smart industry, voice recognition devices are particularly popular with users. However, a problem can arise during the production and manufacture of voice recognition devices: the memory of the device is too small to store more voice instructions.
The internal memory of an IoT voice recognition device comprises two important parts: one stores the recognition instructions, and the other stores the operation instructions. Generally, the instructions to be recognized are listed item by item in the recognition-instruction part, so storing them occupies a large share of the memory addresses. Because the memory of the voice recognition device is limited, this in turn limits how many instructions the device can recognize and the functions it can provide.
Chinese patent (title: a training or adaptation method of a voice recognition apparatus, application No. CN02127545.9, application date: 2002.05.08, publication No. CN1391210A) discloses a training or adaptation method of a voice recognition apparatus. To improve user comfort during training and adaptation with the voice recognition apparatus, it proposes performing voice input and then generating a recognition result; it does not relate to the programs or the recognition specification in the voice recognition apparatus.
Chinese patent (title: a voice activation method and system, application No. CN201410418850.3, application date: 2014.08.22, publication No. CN105374352A) discloses a voice activation method comprising establishing acoustic models according to different environments, optimizing voice streams, extracting voice features, obtaining recognized voice phonemes, and the like. It focuses on methods within the voice recognition model and has little to do with a method for standardizing voice recognition instructions and operation instructions.
Chinese patent (title: update method, apparatus and system of voice recognition device, application No. cn201310163915.x, application date: 2013.05.07, publication No. CN103247291A) discloses an update method, apparatus and system of voice recognition device, in which the method of voice recognition is discussed: a voice recognition method combining a local voice recognition device and a cloud voice recognition device; the method does not involve recognition of voice commands and operation commands.
Chinese patent (title: speech recognition method, speech recognition device and electronic device, application No. CN201310573521.1, application date: 2013.11.14, publication No. CN103632666A) discloses a speech recognition method, a speech recognition device and an electronic device, in which the speech recognition method refers to correcting a speech to be recognized by a speech engine using the speech correction instruction, and then performing speech recognition. That is, the method is used before speech recognition and does not involve recognition of speech commands and operation commands.
Chinese patent (title: a speech recognition type programming method, apparatus and computer equipment, application No. CN201810686496.0, application date: 2018.06.28, publication No. CN108845797A) discloses a speech recognition type programming method, comprising: acquiring voice content; and calling related programming instructions according to the voice content, and combining the programming instructions according to the sequence of the input voice content to form a program. The programming method in the method is suitable for programming teaching, and is mainly a method for completing a voice recognition program by searching stored commands after voice equipment receives a voice instruction; there is no recognition of voice commands and operating commands involved.
Chinese patent (title: a speech recognition method, device, application number: CN201910913836.3, application date: 2019.09.25, publication number: CN110473543A) discloses a method covering the whole speech recognition process, not a method for standardizing the speech recognition instructions and the operation instructions in a speech recognition program.
At present, the memory size of IoT devices is constrained by cost and physical volume, and the embedded program of a voice recognition device allocates memory addresses to recognition instructions by listing them item by item. When the memory size of the voice recognition device is fixed and addresses are allocated to voice recognition instructions in this way, the device can hold only a fixed number of voice recognition instructions, and its memory is almost fully occupied. In summary, a voice recognition device that allocates memory to recognition instructions item by item has the following disadvantages: 1) poor expandability; 2) little recognizable content; 3) overfull memory and slow running speed.
Disclosure of Invention
The application aims to provide a method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment. The method standardizes the recognition instructions and operation instructions of IoT voice recognition equipment, and thereby addresses the problem that a large program cannot be stored or embedded because the memory of the equipment is insufficient. The memory of the IoT voice recognition equipment can be used effectively, the monetary cost of the equipment is reduced to a great extent, and a better embedded program specification is provided for producing small IoT equipment.
The technical scheme adopted by the application is that the method for standardizing the voice recognition instruction and the operation instruction of the voice recognition equipment specifically comprises the following steps:
step 1, collecting and classifying voice recognition instructions and operation instructions;
step 2, writing a voice recognition instruction program according to the voice recognition instruction classified in the step 1;
step 3, compiling a program of the operation instruction according to the operation instruction classified in the step 1;
step 4, operating the voice recognition equipment according to the programs written in the steps 2 and 3.
Preferably, the specific process of step 1 is:
step 1.1, collecting voice instructions received by the voice recognition equipment;
step 1.2, classifying the voice recognition instructions and the operation instructions respectively, based on the voice instructions collected in step 1.1.
Preferably, in step 1.2, the voice recognition instructions and the operation instructions are each divided, after classification, into three parts or two parts (referred to as the "trisection method");
when the voice recognition instruction and the operation instruction each have three parts, they comprise a wakeup word part, an action part and an action degree part;
when the voice recognition instruction and the operation instruction each have two parts, they comprise a wakeup word part and an action part;
and when the voice recognition instructions and the operation instructions are classified, the voice recognition instructions correspond to the operation instructions one to one.
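The two-part and three-part classification described above can be pictured with a small data structure. The following is a minimal sketch, not the applicant's implementation: the class name `VoiceCommand`, the helper `classify`, and the assumption that an utterance has already been split into its parts are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VoiceCommand:
    """One classified voice instruction (hypothetical illustration)."""
    wake_word: str                # wakeup word part
    action: str                   # action part
    degree: Optional[str] = None  # action degree part, absent in two-part instructions

    @property
    def part_count(self) -> int:
        # Two parts when there is no action degree part, otherwise three.
        return 2 if self.degree is None else 3


def classify(parts: Sequence[str]) -> VoiceCommand:
    """Build a command from an utterance that is assumed to be pre-split into parts."""
    if len(parts) == 2:
        return VoiceCommand(parts[0], parts[1])
    if len(parts) == 3:
        return VoiceCommand(parts[0], parts[1], parts[2])
    raise ValueError("expected a two-part or three-part instruction")
```

For example, `classify(["hello car", "turn left", "30 degrees"])` would yield a three-part command, while `classify(["hello car", "stop"])` would yield a two-part command whose `degree` is `None`. The phrases themselves are made up.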
Preferably, the specific process of step 2 is:
step 2.1, writing the voice recognition instructions that comprise only two parts;
writing the wakeup word part and the action part of the voice recognition instruction in sequence;
step 2.2, writing the voice recognition instructions that comprise three parts;
first writing the action part of the voice recognition instruction, and then writing the action degree part corresponding to that action part.
Preferably, the wakeup word part in step 2.2 reuses the wakeup word part program written in step 2.1.
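As a rough illustration of steps 2.1 and 2.2, the recognition vocabulary might be laid out as below, with the wakeup word written once and shared by both kinds of instruction. This is only a sketch under stated assumptions: the table names, the function `is_recognizable`, and every phrase are hypothetical and do not come from the application.

```python
# Hypothetical vocabulary; the phrases are illustrative only.

# Wakeup word part: written once in step 2.1 and reused by step 2.2.
WAKE_WORDS = ("hello car",)

# Step 2.1: action parts of two-part instructions (no action degree part).
TWO_PART_ACTIONS = ("stop", "start")

# Step 2.2: action parts of three-part instructions, each mapped to the
# action degree parts that correspond to it.
THREE_PART_ACTIONS = {
    "turn left": ("30 degrees", "60 degrees", "90 degrees"),
    "turn right": ("30 degrees", "60 degrees", "90 degrees"),
}


def is_recognizable(wake, action, degree=None):
    """Return True if the parts form an instruction present in the tables."""
    if wake not in WAKE_WORDS:
        return False
    if degree is None:
        return action in TWO_PART_ACTIONS
    return degree in THREE_PART_ACTIONS.get(action, ())
```

Because the wakeup word lives in a single table, adding a new action in this sketch would mean adding one entry to one of the action tables rather than a complete new phrase.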
The specific process of the step 3 is as follows:
step 3.1, writing the operation instructions that comprise only two parts;
writing the operation instruction to be executed after the voice recognition equipment recognizes the wakeup word; then writing the operation instruction to be executed after the voice recognition equipment recognizes the action part;
step 3.2, writing the operation instructions that comprise three parts;
writing the operation instruction to be executed after the voice recognition equipment recognizes the action part; and then writing the operation instruction of the action degree part corresponding to that action part.
In step 3.2, the operation instruction executed after the voice recognition device recognizes the wakeup word reuses the wakeup word operation instruction program written in step 3.1.
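One way to picture steps 3.1 and 3.2 is a set of small handlers, one per part, so that each recognized part has exactly one matching operation. The sketch below is an assumption-laden illustration: the handler names, the `log` list standing in for real hardware output, and the phrases are all hypothetical.

```python
log = []  # stands in for real hardware output in this sketch


def on_wake():
    # Wakeup word operation: written once (step 3.1), reused by step 3.2.
    log.append("wake: acknowledged")


def stop():
    log.append("action: stop")


def turn_left():
    log.append("action: turn left")


def turn_by(degree):
    # Operation for the action degree part of a three-part instruction.
    log.append(f"degree: {degree}")


# One operation per action part (steps 3.1 and 3.2).
ACTION_OPS = {"stop": stop, "turn left": turn_left}

# Degree operations exist only for actions that have an action degree part.
DEGREE_OPS = {"turn left": turn_by}


def run_operations(action, degree=None):
    """Execute the operations matching the recognized parts, in order."""
    on_wake()
    ACTION_OPS[action]()
    if degree is not None:
        DEGREE_OPS[action](degree)
```

Calling `run_operations("turn left", "30 degrees")` appends the wakeup, action, and degree entries to `log` in that order; a two-part instruction such as `run_operations("stop")` skips the degree handler.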
The specific process of the step 4 is as follows:
step 4.1, the voice recognition device receives a voice instruction and judges whether the received voice instruction contains the corresponding wakeup word part; if the voice recognition device does not recognize the wakeup word in the voice instruction, it waits for the next voice instruction and tries to recognize the wakeup word again; if the voice recognition device recognizes the wakeup word in the voice instruction, it executes step 4.2;
step 4.2, the voice recognition device recognizes the action part and the action degree part in the voice instruction; when the device recognizes the action part and the action part has a corresponding action degree part, step 4.3 is executed; when the device recognizes the action part and the action part has no corresponding action degree part, step 4.4 is executed; when the device does not recognize the action part, step 4.2 is repeated until the action part of the voice instruction is recognized;
step 4.3, the action degree part of the voice instruction is recognized; when the voice recognition device recognizes the action degree part of the voice instruction, step 4.5 is executed; when the device does not recognize the action degree part, step 4.3 is repeated;
step 4.4, the operation instruction of the action part corresponding to the voice recognition instruction is executed, and the whole operation process ends once this operation instruction has been executed;
step 4.5, the operation instruction of the action part corresponding to the voice recognition instruction is executed, and step 4.6 is executed once this operation instruction has been executed;
step 4.6, the operation instruction of the action degree part corresponding to the action part in the received voice instruction is executed, and the whole operation process ends once this operation instruction has been executed.
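The control flow of steps 4.1 to 4.6 can be sketched as a small loop over recognized utterances. This is a minimal illustration and not the embedded program of the application: the function `run_device`, its parameters, and the convention that `actions` maps each action part to a tuple of its action degree parts (empty for two-part instructions) are assumptions made here.

```python
def run_device(utterances, wake_words, actions, execute):
    """Process one voice instruction; return True if it ran to completion.

    utterances: iterable of already-recognized words or phrases (assumed input)
    actions:    maps each action part to a tuple of its degree parts
    execute:    callback execute(action, degree=None) performing the operation
    """
    stream = iter(utterances)

    # Step 4.1: wait until the wakeup word part is recognized.
    for heard in stream:
        if heard in wake_words:
            break
    else:
        return False  # input ended without a wakeup word

    # Step 4.2: repeat until an action part is recognized.
    for heard in stream:
        if heard in actions:
            action = heard
            break
    else:
        return False  # input ended without an action part

    degrees = actions[action]
    if not degrees:
        execute(action)  # step 4.4: two-part instruction, process ends
        return True

    # Step 4.3: repeat until the matching action degree part is recognized.
    for heard in stream:
        if heard in degrees:
            execute(action)         # step 4.5: operation of the action part
            execute(action, heard)  # step 4.6: operation of the degree part
            return True
    return False  # input ended without an action degree part
```

In this sketch, unrecognized input is simply skipped, which mirrors the "repeat until recognized" wording of steps 4.2 and 4.3; a real device would keep listening rather than return when its input is exhausted.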
The beneficial effects of this application are as follows:
(1) the amount of content that the voice recognition equipment can recognize is increased;
(2) the expandability of the voice recognition equipment is increased to a certain extent;
(3) the problem that the memory of the voice recognition equipment is overfull and the running speed is low is alleviated;
(4) the memory requirement of the voice recognition equipment is reduced to a certain extent, and the cost is reduced.
Drawings
FIG. 1 is a flowchart of writing a voice recognition program by the "trisection method" in the method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment according to the present application;
FIG. 2 is a flowchart of the operation of the voice recognition program in the method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment according to the present application;
FIG. 3 is a comparison, for two applications in the embodiment of the method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment, of the numbers of instructions in the smart car voice recognition program written by the itemized listing method and by the "trisection method".
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application.
The application relates to a method for standardizing a voice recognition instruction and an operation instruction of voice recognition equipment, which specifically comprises the following steps of:
step 1, collecting and classifying voice recognition instructions and operation instructions;
the specific process of the step 1 is as follows:
step 1.1, collecting voice instructions received by the voice recognition equipment; most (more than 95%) of the voice recognition instructions are found to share the same sentence structure and recognition order;
step 1.2, classifying the voice recognition instructions and the operation instructions respectively, based on the voice instructions collected in step 1.1.
In step 1.2, the voice recognition instructions and the operation instructions can each be divided into three parts or two parts after being classified;
when the voice recognition instruction and the operation instruction each have three parts, they comprise a wakeup word part, an action part and an action degree part;
when the voice recognition instruction and the operation instruction each have two parts, they comprise a wakeup word part and an action part;
and when the voice recognition instructions and the operation instructions are classified, the voice recognition instructions correspond to the operation instructions one to one.
Step 2, writing a voice recognition instruction program according to the voice recognition instruction classified in the step 1;
the specific process of the step 2 is as follows:
step 2.1, writing the voice recognition instructions that comprise only two parts;
writing the wakeup word part and the action part of the voice recognition instruction in sequence;
step 2.2, writing the voice recognition instructions that comprise three parts;
first writing the action part of the voice recognition instruction, and then writing the action degree part corresponding to that action part. The wakeup word part in step 2.2 reuses the wakeup word part program written in step 2.1.
Step 3, compiling a program of the operation instruction according to the operation instruction classified in the step 1;
the specific process of the step 3 is as follows:
step 3.1, writing the operation instructions that comprise only two parts;
writing the operation instruction to be executed after the voice recognition equipment recognizes the wakeup word; then writing the operation instruction to be executed after the voice recognition equipment recognizes the action part;
step 3.2, writing the operation instructions that comprise three parts;
writing the operation instruction to be executed after the voice recognition equipment recognizes the action part; and then writing the operation instruction of the action degree part corresponding to that action part. In step 3.2, the operation instruction executed after the voice recognition device recognizes the wakeup word reuses the wakeup word operation instruction program written in step 3.1.
Step 4, operating the voice recognition equipment according to the programs written in the steps 2 and 3.
The specific process of the step 4 is as follows:
step 4.1, the voice recognition device first receives a voice instruction and then judges whether the received voice instruction contains the corresponding wakeup word part. If the voice recognition device does not recognize the wakeup word in the voice instruction, it waits for the next voice instruction and tries to recognize the wakeup word again; if the voice recognition device recognizes the wakeup word in the voice instruction, it proceeds to the next recognition step;
step 4.2, the voice recognition device recognizes the action part and the action degree part in the voice instruction; the action part and the action degree part correspond to each other. If the program recognizes an action part in this step, it judges whether that action part has a corresponding action degree part: if it does, the recognition step 4.3 for the action degree part is executed; if it does not, step 4.4 of the operation instruction part is executed. If no action part is recognized in this step, the step is repeated until the action part of the voice instruction is recognized;
step 4.3, if the action part recognized in the previous step has a corresponding action degree part, the action degree part is recognized (if the action part has no action degree part, this step is skipped and step 4.4 of the operation instruction part is executed). If the action degree part corresponding to the action part in the voice instruction is recognized, step 4.5 of the operation instruction part is executed; if it is not recognized, step 4.3 is repeated;
step 4.4, the voice recognition program has finished the recognition steps and starts executing operation instructions from this step. The operation instruction of the action part (corresponding to the voice instruction) is executed. Because the action part in this step has no corresponding action degree part, the whole process ends once the operation instruction has been executed (the voice recognition program waits for the next voice instruction and restarts the process);
step 4.5, the voice recognition program executes the operation instruction of the action part in the received voice instruction. Because the action part in this step has a corresponding action degree part, step 4.6 for the action degree instruction is then executed;
step 4.6, the voice recognition program executes the operation instruction of the action degree part (corresponding to the action part). After the action degree instruction of this step has been executed, the whole process ends (the voice recognition program waits for the next voice instruction and restarts the process).
Examples
The application takes a smart car (voice instructions and operation instructions for controlling the smart car to turn left or right) as an example, starting with a voice recognition program written by the itemized listing method.
Table 1 below counts the voice instructions and operation instructions in the voice recognition program written by the itemized listing method. For the smart car to complete left and right turns through voice recognition instructions and operation instructions, the voice recognition device (the smart car) requires 14 voice recognition instructions and 56 operation instructions in total.
TABLE 1
Table 2 shows the voice recognition instructions of five voice recognition devices (smart car, electric light, electric fan, refrigerator, and television), obtained by summarizing the characteristics of a large number of voice recognition instructions and operation instructions. Most (more than 95%) of the voice instructions of a voice recognition device comprise three parts: a wakeup word part, an action part, and an optional action degree part.
TABLE 2
As shown in Table 3 below, the total number of voice recognition instructions of the voice recognition device (smart car) is reduced from 14 to 10, and the cumulative number of declared operation instructions is reduced from 56 to 18 (the numbers are derived from Tables 1 and 3).
TABLE 3
Dividing the voice recognition instructions and the operation instructions in the voice recognition program into three parts (a wakeup word part, an action part and an action degree part), referred to as the "trisection method" for short, efficiently standardizes the voice recognition instructions, the operation instructions, and the whole recognition process in the voice recognition program. Comparing the voice recognition programs of the smart car (voice recognition equipment) written by the two methods in two applications (a left-turn/right-turn application and a forward/backward application) shows that a program written by the "trisection method" occupies fewer register addresses, which reduces the memory requirement of the voice recognition equipment to a great extent.
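To give a feel for why storing parts separately can take less space than listing every complete phrase, the following sketch counts stored parts under a deliberately simple model. The counting model, the function names, and the vocabulary are assumptions made for illustration; they are not the accounting behind Tables 1 to 5 and do not reproduce the figures reported in this application.

```python
def itemized_parts(wake_words, actions):
    """Parts stored when every complete phrase is listed separately (assumed model)."""
    total = 0
    for degrees in actions.values():
        phrases = max(1, len(degrees))          # one phrase per action/degree pair
        parts_per_phrase = 3 if degrees else 2  # wakeup word + action (+ degree)
        total += len(wake_words) * phrases * parts_per_phrase
    return total


def trisection_parts(wake_words, actions):
    """Parts stored when each wakeup word, action, and degree is stored once per action."""
    return len(wake_words) + len(actions) + sum(len(d) for d in actions.values())


# Hypothetical vocabulary: one wakeup word, one two-part action,
# and two three-part actions with four degree parts each.
WAKE_WORDS = ("hello car",)
ACTIONS = {
    "stop": (),
    "turn left": ("15 degrees", "30 degrees", "45 degrees", "90 degrees"),
    "turn right": ("15 degrees", "30 degrees", "45 degrees", "90 degrees"),
}

print(itemized_parts(WAKE_WORDS, ACTIONS))    # 26 under this model
print(trisection_parts(WAKE_WORDS, ACTIONS))  # 12 under this model
```

Under this model the gap widens as more degree parts are added per action, because the itemized count repeats the wakeup word and the action in every phrase.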
Fig. 3 compares the numbers of voice instructions and operation instructions in the smart car voice recognition programs written by the two methods in two applications (a left-turn/right-turn application and a forward/backward application). In the left-turn/right-turn application, the total number of voice recognition instructions in the program written by the "trisection method" is reduced to 10 (from 14 with the itemized listing method), and the cumulative number of declared operation instructions is reduced to 18 (from 56 with the itemized listing method); the specific numbers are derived from Tables 1 and 3. In the forward/backward application, the total number of voice recognition instructions in the program written by the "trisection method" is reduced to 10 (from 16 with the itemized listing method), and the cumulative number of declared operation instructions is reduced to 20 (from 64 with the itemized listing method); the specific numbers are derived from Tables 4 and 5.
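The relative savings implied by these counts can be worked out directly. The short sketch below uses only the instruction counts stated above; the dictionary layout and the helper name `reduction_percent` are hypothetical.

```python
# (itemized listing count, trisection count) as stated in the text above.
COUNTS = {
    "left/right turn": {"recognition": (14, 10), "operation": (56, 18)},
    "forward/backward": {"recognition": (16, 10), "operation": (64, 20)},
}


def reduction_percent(before, after):
    """Percentage by which the instruction count drops."""
    return 100.0 * (before - after) / before


for application, kinds in COUNTS.items():
    for kind, (before, after) in kinds.items():
        pct = reduction_percent(before, after)
        print(f"{application}, {kind} instructions: {before} -> {after} ({pct:.1f}% fewer)")
```

On these figures the operation instructions shrink by roughly two thirds in both applications, while the recognition instructions shrink by between about a quarter and three eighths.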
Table 4 shows the number of voice recognition instructions and the number of operation instructions in the smart car voice recognition program (forward/backward application) written by the itemized listing method;
TABLE 4
Table 5 below shows the number of voice recognition instructions and the number of operation instructions in the smart car voice recognition program (forward/backward application) written by the "trisection method";
TABLE 5
The main advantage of the method for standardizing the voice recognition instruction and the operation instruction of voice recognition equipment is that it overcomes the problem of a voice recognition program being too large to store in limited memory. It reduces the size of the voice recognition program to a certain extent and saves both development time and hardware memory cost. In addition, voice instructions and operation instructions can be added to or deleted from the voice recognition device efficiently. Especially when the memory of the device is limited, writing the voice recognition program by this method allows more voice instructions and operation instructions to be accommodated, saves a large amount of memory, and improves the running efficiency of the voice recognition program to a certain extent.
The method writes the voice recognition instructions and the operation instructions in segments: by dividing them into three parts (a wakeup word part, an action part and an action degree part), it reduces the memory that a voice recognition device occupies and requires. Voice recognition devices are of many kinds, and this application cannot enumerate them one by one; writing the voice recognition instructions and operation instructions of the voice recognition program for a different voice recognition device in the same manner as described in this application is an alternative embodiment of this application. Some such applications are listed below:
(1) application to the programs of the smart devices mentioned in this application (such as smart cars, electric lamps, fans, refrigerators, televisions and the like);
(2) application to the programs of smart household appliances and other smart devices with a voice recognition function that are not mentioned in this application (such as washing machines, smart speakers, smart automobiles and the like).
Although the present application has been described above with reference to specific embodiments, those skilled in the art will recognize that many changes may be made in the configuration and details of the present application within the principles and scope of the present application. The scope of protection of the application is determined by the appended claims, and all changes that come within the meaning and range of equivalency of the technical features are intended to be embraced therein.