CN110705311B

CN110705311B - Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium

Info

Publication number: CN110705311B
Application number: CN201910923025.1A
Authority: CN
Inventors: 冯海洪; 毛德平; 王康; 朱国冉
Original assignee: Anhui Mimouse Technology Co ltd
Current assignee: Anhui Mimouse Technology Co ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2022-11-25
Anticipated expiration: 2039-09-27
Also published as: CN110705311A

Abstract

The invention relates to the field of data processing, and in particular to a semantic understanding accuracy improving method, device and system and a storage medium. The method comprises the following steps: first, voice data input by a user is acquired through an intelligent voice mouse end; the voice data is then cleaned to remove irrelevant voice information such as noise; a bag-of-words model is established, and the model is fitted, trained, tested and optimized; finally, using the above modules and an end-to-end method, the sentence is treated as a sequence of word vectors so that grammar is exploited.

Description

Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium

Technical Field

The invention relates to the field of data processing, in particular to a semantic understanding accuracy improving method, a semantic understanding accuracy improving device, a semantic understanding accuracy improving system and a storage medium applied to an intelligent voice mouse.

Background

Natural language understanding (NLU) currently faces two problems. On the one hand, grammars so far are limited to analyzing isolated sentences, and the constraints and influence of context and conversational environment on a sentence have not been studied systematically. As a result, there are no clear rules to follow when resolving ambiguity, word omission, word substitution, and the different meanings the same sentence takes on different occasions or when spoken by different people; the error rate is high, and these problems can only be solved step by step by strengthening research on pragmatics. On the other hand, people understand a sentence not by grammar alone but by applying a great deal of knowledge, including everyday knowledge and specialized expertise, not all of which can be stored in a computer. A written-language understanding system can therefore only be built within a limited range of vocabulary, sentence patterns and specific topics; only after the storage capacity and operating speed of computers have increased substantially will it be possible to expand that range appropriately.

These problems are a major obstacle in machine translation, an application of natural language understanding, and are one of the reasons why the translation quality of current machine translation systems is still far from the ideal target; the quality of the translated text is the key to the success or failure of a machine translation system. The invention provides a semantic understanding accuracy improving method applied to an intelligent voice mouse, which improves the accuracy of natural language understanding by combining the intelligent voice mouse with application software on the computer.

Disclosure of Invention

In view of the above problems, the invention aims to improve the accuracy of natural language semantic understanding by combining an intelligent voice mouse with application software on the computer end. To solve the problems in the prior art, the invention provides a semantic understanding accuracy improving method applied to an intelligent voice mouse, which comprises the following steps:

step S1: acquiring voice data input by a user through an intelligent voice mouse end;

step S2: cleaning the voice data to remove irrelevant voice information such as noise;

step S3: representing the data in numerical form and inputting it into a machine learning model;

step S4: classifying, namely splitting the data into a training set, used to fit the model, and a test set, used to check how well the model generalizes to unseen data;

step S5: inspecting, namely validating the model, interpreting its predictions, and determining which words the model relies on to make decisions;

step S6: accounting for vocabulary structure, namely applying TF-IDF scores on top of the model so that the model focuses more on meaningful words;

step S7: utilizing semantics, namely capturing semantics with the aid of Word2Vec so that the model can pick up high-signal words beyond those seen in training;

step S8: utilizing grammar, namely treating the sentence as a sequence of word vectors using an end-to-end approach.
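The split described in step S4 can be illustrated with a minimal Python sketch. The patent does not specify a split ratio, a shuffling strategy or a data format, so the 20% test ratio, the fixed seed, the function name `train_test_split` and the sample utterances below are all assumptions made for illustration only.

```python
import random

def train_test_split(samples, test_ratio=0.2, seed=0):
    """Shuffle labeled samples and split them into a training set and a test set."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(samples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_ratio))
    # The first n_test shuffled items are held out; the rest are used for fitting.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (utterance, intent) pairs; the patent does not give a data set.
data = [
    ("open the browser", "open_app"),
    ("open the music player", "open_app"),
    ("translate this sentence", "translate"),
    ("translate the page into english", "translate"),
    ("search for the weather", "search"),
]
train_set, test_set = train_test_split(data, test_ratio=0.2)
```

The model would be fitted only on `train_set`; accuracy measured on `test_set` then approximates how the model behaves on utterances it has not seen.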

Preferably, the learning model in step S3 is a bag-of-words model: a vocabulary containing all words in the data set is built and a unique index is created for each word in the vocabulary; each sentence is then represented as a list whose length equals the number of distinct words in the vocabulary, and at each index of the list the number of times the corresponding word appears in the sentence is recorded.
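The bag-of-words representation described above can be sketched in a few lines of Python. The whitespace tokenization, the lower-casing, the function names and the two sample sentences are assumptions for illustration; the patent does not state how sentences are tokenized.

```python
def build_vocabulary(sentences):
    """Assign a unique index to every distinct word in the data set."""
    vocabulary = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary

def bag_of_words(sentence, vocabulary):
    """Return a list as long as the vocabulary holding each word's count in the sentence."""
    counts = [0] * len(vocabulary)
    for word in sentence.lower().split():
        index = vocabulary.get(word)
        if index is not None:        # words outside the vocabulary are ignored
            counts[index] += 1
    return counts

# Hypothetical sample sentences, not taken from the patent.
sentences = ["open the file", "close the file and open the folder"]
vocabulary = build_vocabulary(sentences)
vectors = [bag_of_words(s, vocabulary) for s in sentences]
```

Here the vocabulary has six entries, so every sentence becomes a list of six counts; word order is discarded, which is the limitation that the later steps (TF-IDF, Word2Vec and the sequence-based approach) are meant to address.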

Preferably, the end-to-end method in step S8 includes GloVe or CoVe.
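As a rough illustration of treating a sentence as a sequence rather than a bag of words, the following Python sketch maps each word to a vector and keeps the word order. The 3-dimensional embedding table, the zero vectors used for unknown words and padding, the maximum length and the function name are all made up for this example; in practice the vectors would come from a pretrained resource such as GloVe, and the patent does not describe the downstream sequence model.

```python
# Toy 3-dimensional embedding table. These values are invented for
# illustration and are NOT real GloVe or CoVe vectors.
EMBEDDINGS = {
    "open": [0.9, 0.1, 0.0],
    "close": [-0.9, 0.1, 0.0],
    "the": [0.0, 0.0, 0.1],
    "file": [0.1, 0.8, 0.2],
}
UNKNOWN = [0.0, 0.0, 0.0]   # assumed placeholder for out-of-vocabulary words
PAD = [0.0, 0.0, 0.0]       # assumed padding vector for short sentences

def sentence_to_sequence(sentence, max_len=5):
    """Map a sentence to a fixed-length, order-preserving list of word vectors."""
    words = sentence.lower().split()[:max_len]
    sequence = [list(EMBEDDINGS.get(word, UNKNOWN)) for word in words]
    sequence += [list(PAD) for _ in range(max_len - len(sequence))]
    return sequence

seq_a = sentence_to_sequence("open the file")
seq_b = sentence_to_sequence("file the open")
```

Unlike a bag-of-words list, `seq_a` and `seq_b` differ even though they contain the same words, so a sequence model fed with such input can in principle make use of word order.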

In order to achieve the above object, the present invention further provides a semantic understanding accuracy improving apparatus applied to the intelligent voice mouse, comprising:

a voice collection module, which acquires voice data input by a user through the intelligent voice mouse end;

a voice cleaning module, which removes irrelevant voice information such as noise;

a model establishing module, which represents the data in numerical form as input to the machine learning model and establishes a bag-of-words model;

a model fitting module, which fits the established model so as to check how well the model generalizes to unseen data;

a model testing and predicting module, which tests the model, interprets its predictions and determines which words the model relies on to make decisions;

a model training module, which applies TF-IDF scores on top of the model so that the model focuses more on meaningful words;

a model optimization module, which captures semantics with the aid of Word2Vec so that the model can pick up high-signal words beyond those seen in training;

wherein, using the above modules, the sentence is treated as a sequence of word vectors in an end-to-end approach so that grammar is exploited.

In order to achieve the above object, the present invention further provides a semantic understanding accuracy improving system applied to an intelligent voice mouse, including an intelligent voice mouse end, a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of the method when executing the computer program.

To achieve the above object, the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the steps of the above method are implemented when the computer program is executed by a processor.

The invention has the beneficial effects that:

(1) The semantic understanding accuracy improving method applied to the intelligent voice mouse combines the intelligent voice mouse with application software on the computer end, thereby improving the accuracy of natural language semantic understanding.

(2) The latest natural language processing technology in the field of artificial intelligence is applied to analyze the intention of the user accurately.

Drawings

Fig. 1 is an overall flowchart of a semantic understanding accuracy improving method applied to an intelligent voice mouse according to embodiment 1 of the present invention.

Fig. 2 is a block diagram of a semantic understanding accuracy improving apparatus applied to an intelligent voice mouse according to embodiment 2 of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

Example 1

Fig. 1 is an overall flowchart of a semantic understanding accuracy improving method applied to an intelligent voice mouse according to embodiment 1 of the present invention. As shown in fig. 1, the semantic understanding accuracy improving method applied to an intelligent voice mouse includes the following steps:

Step S1: acquiring voice data input by a user through the intelligent voice mouse end.

Step S2: cleaning the voice data to remove irrelevant voice information such as noise.

Step S3: representing the data in numerical form and inputting it into a machine learning model.

In this step, the learning model is a bag-of-words model: a vocabulary containing all words in the data set is built and a unique index is created for each word in the vocabulary; each sentence is then represented as a list whose length equals the number of distinct words in the vocabulary, and at each index of the list the number of times the corresponding word appears in the sentence is recorded.

Step S4: classifying, namely splitting the data into a training set, used to fit the model, and a test set, used to check how well the model generalizes to unseen data.

Step S5: inspecting, namely validating the model, interpreting its predictions, and determining which words the model relies on to make decisions.

Step S6: accounting for vocabulary structure, namely applying TF-IDF (term frequency, inverse document frequency) scores on top of the model so that the model focuses more on meaningful words.

Step S7: utilizing semantics, namely capturing semantics with the aid of Word2Vec so that the model can pick up high-signal words beyond those seen in training.

Step S8: utilizing grammar, namely treating the sentence as a sequence of word vectors using an end-to-end approach.

In this step, the end-to-end method includes GloVe or CoVe.
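The TF-IDF weighting named in step S6 can be sketched as follows. The patent does not give a formula, so this example assumes the common unsmoothed variant, weight = (count of the word in the sentence / sentence length) × log(N / number of sentences containing the word); the function name, the whitespace tokenization and the sample sentences are likewise assumptions for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {word: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights

# Hypothetical sample utterances, not taken from the patent.
docs = ["open the file", "close the file", "translate the sentence"]
scores = tf_idf(docs)
```

In this toy data set the word "the" occurs in every sentence and therefore receives a weight of zero, while rarer words such as "open" receive larger weights, which matches the stated goal of making the model focus on meaningful words.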

Example 2

Fig. 2 is a block diagram of a semantic understanding accuracy improving apparatus applied to an intelligent voice mouse according to embodiment 2 of the present invention. As shown in fig. 2, this embodiment provides a semantic understanding accuracy improving apparatus applied to an intelligent voice mouse, which includes:

a model training module that uses a TF-IDF score (term frequency, inverse document frequency) on top of the model so that the model focuses more on meaningful words;

Example 3

The embodiment provides a semantic understanding accuracy improving system applied to an intelligent voice mouse, which comprises an intelligent voice mouse end, a memory, a processor and a computer program, wherein the computer program is stored in the memory and can run on the processor, and the processor executes the computer program to realize the steps of the method.

Example 4

The present embodiment provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of the above-described method.

In summary, the semantic understanding accuracy improving method, apparatus, system and storage medium applied to the intelligent voice mouse disclosed in the above embodiments of the present invention analyze the intention of the user accurately by using the latest natural language processing technology in the field of artificial intelligence, and improve the accuracy of natural language semantic understanding by combining the intelligent voice mouse with application software on the computer.

The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any change or modification that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention. Therefore, the scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A semantic understanding accuracy improving method applied to an intelligent voice mouse is characterized by comprising the following steps:

step S2: cleaning the voice data to remove irrelevant voice information such as noise;

step S3: representing the data in numerical form and inputting it into a machine learning model;

step S6: accounting for vocabulary structure, namely applying TF-IDF scores on top of the model so that the model focuses more on meaningful words;

2. The semantic understanding accuracy improving method applied to the intelligent voice mouse according to claim 1, characterized in that: the learning model in step S3 is a bag of words model, each sentence is represented as a list by building a vocabulary comprising all words in the dataset and creating a unique index for each word in the vocabulary, the length of the list depending on the number of different words, and in each index in the list, the number of times a given word appears in a sentence is marked.

3. The semantic understanding accuracy improving method applied to the intelligent voice mouse according to claim 1, characterized in that: the end-to-end method in step S8 includes GloVe or CoVe.

4. A semantic understanding accuracy improving apparatus applied to an intelligent voice mouse, characterized by comprising:

5. A semantic understanding accuracy improving system applied to an intelligent voice mouse, comprising an intelligent voice mouse end, a memory, a processor and a computer program stored in the memory and capable of running on the processor, characterized in that: the processor, when executing the computer program, implements the steps of the method of any of the preceding claims 1 to 3.

6. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed by a processor implements the steps of the method of any of claims 1 to 3.