CN110705311B - Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium

Publication number
CN110705311B
Authority
CN
China
Prior art keywords
model
intelligent voice
semantic understanding
words
word
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910923025.1A
Other languages
Chinese (zh)
Other versions
CN110705311A (en)
Inventor
冯海洪
毛德平
王康
朱国冉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Mimouse Technology Co ltd
Original Assignee
Anhui Mimouse Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-27
Filing date
2019-09-27
Publication date
2022-11-25
Application filed by Anhui Mimouse Technology Co ltd
Priority to CN201910923025.1A
Publication of CN110705311A
Application granted
Publication of CN110705311B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0354 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
    • G06F3/03543 Mice or pucks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to the field of data processing, in particular to a semantic understanding accuracy improving method, device, system and storage medium. The method comprises the following steps: first, voice data input by a user is acquired through an intelligent voice mouse end; the voice data is then cleaned to remove irrelevant audio such as noise; a bag-of-words model is established, and the model is fitted, trained, tested and optimized; finally, a syntax utilization module uses an end-to-end method to treat each sentence as a sequence of word vectors.

Description

Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium
Technical Field
The invention relates to the field of data processing, in particular to a semantic understanding accuracy improving method, a semantic understanding accuracy improving device, a semantic understanding accuracy improving system and a storage medium applied to an intelligent voice mouse.
Background
Natural language understanding (NLU) currently faces two problems. On the one hand, grammars so far have been limited to analyzing isolated sentences, and the constraints and influence that context and the conversational setting place on a sentence have not been studied systematically. As a result, there are no clear rules to follow when resolving ambiguity, word omission, word substitution, or the different meanings that the same sentence takes on different occasions or when spoken by different people, and the error rate is high; these problems can only be solved step by step by strengthening research in pragmatics. On the other hand, people understand a sentence not through grammar alone but by drawing on a great deal of knowledge, including everyday knowledge and specialized expertise, not all of which can be stored in a computer. Therefore, a written-language understanding system can only be built within a limited range of words, sentence patterns and specific topics; that range can be expanded appropriately only after the storage capacity and operating speed of computers have increased substantially.
These problems are a major obstacle in machine translation, an application of natural language understanding, and they are one of the reasons why the translation quality of current machine translation systems remains far from the ideal; translation quality is the key to the success or failure of a machine translation system. The invention relates to a semantic understanding accuracy improving method applied to an intelligent voice mouse, which improves the accuracy of natural language understanding by combining the intelligent voice mouse with application software on the computer.
Disclosure of Invention
In view of the above problems, the invention aims to improve the accuracy of natural language semantic understanding by combining an intelligent voice mouse with application software on the computer end. To solve the problems in the prior art, the invention provides a semantic understanding accuracy improving method applied to the intelligent voice mouse, which comprises the following steps:
step S1: acquiring voice data input by a user through an intelligent voice mouse end;
step S2: cleaning the voice data to remove irrelevant audio such as noise;
step S3: converting the data into numerical form and inputting it into a machine learning model;
step S4: classifying, namely splitting the data into a training set used to fit the model and a test set used to check how well the model generalizes to unseen data;
step S5: inspecting, namely validating the model, interpreting its predictions, and determining which words the model uses to make its decisions;
step S6: accounting for vocabulary structure, namely using TF-IDF scores on top of the model so that the model focuses more on meaningful words;
step S7: leveraging semantics, namely capturing semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training;
step S8: leveraging syntax, namely using an end-to-end approach to treat each sentence as a sequence of word vectors.
Preferably, the learning model in step S3 is a bag-of-words model: a vocabulary containing all words in the data set is built and a unique index is assigned to each word in the vocabulary; each sentence is then represented as a list whose length equals the number of distinct words in the vocabulary, and each index in the list records the number of times the corresponding word appears in the sentence.
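The bag-of-words representation described above can be illustrated with a minimal sketch. The function names, the whitespace tokenization and the sample commands are assumptions made for illustration only; the source does not specify them.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Assign a unique index to every distinct word in the data set."""
    vocabulary = {}
    for sentence in sentences:
        for word in sentence.lower().split():  # assumed: whitespace tokenization
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def to_count_vector(sentence, vocabulary):
    """Represent a sentence as a list of counts, one slot per vocabulary word."""
    counts = Counter(sentence.lower().split())
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        index = vocabulary.get(word)
        if index is not None:  # words outside the vocabulary are ignored
            vector[index] = count
    return vector


# Hypothetical transcribed voice commands.
sentences = ["open the file", "close the file", "open the open window"]
vocabulary = build_vocabulary(sentences)
vectors = [to_count_vector(s, vocabulary) for s in sentences]
```

Each vector has one entry per vocabulary word, so the list length depends only on the number of distinct words, as the paragraph above states.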
Preferably, the end-to-end method in step S8 includes GloVe or CoVe.
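As a rough sketch of what treating a sentence as a sequence of word vectors could look like, the snippet below maps each word to a vector while preserving word order. The embedding table holds made-up values standing in for pretrained vectors such as GloVe; the names `EMBEDDINGS`, `UNKNOWN`, `sentence_to_sequence` and the fixed `max_length` are assumptions, not part of the source.

```python
# Toy 3-dimensional embedding table. The values are invented for illustration;
# a real system would load pretrained vectors (for example GloVe) instead.
EMBEDDINGS = {
    "open": [0.9, 0.1, 0.0],
    "close": [0.8, -0.2, 0.1],
    "the": [0.0, 0.0, 0.1],
    "file": [0.1, 0.9, 0.3],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed fallback for out-of-vocabulary words and padding


def sentence_to_sequence(sentence, max_length=5):
    """Map a sentence to a fixed-length, order-preserving list of word vectors."""
    tokens = sentence.lower().split()[:max_length]
    sequence = [list(EMBEDDINGS.get(token, UNKNOWN)) for token in tokens]
    # Pad short sentences so every sequence has the same length.
    while len(sequence) < max_length:
        sequence.append(list(UNKNOWN))
    return sequence


sequence = sentence_to_sequence("Open the file")
```

Unlike a count vector, this representation keeps word order, which is the information a downstream sequence model would need.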
In order to achieve the above object, the present invention further provides a semantic understanding accuracy improving apparatus applied to the intelligent voice mouse, comprising:
the voice collection module, which acquires voice data input by a user through the intelligent voice mouse end;
the voice cleaning module, which removes irrelevant audio such as noise;
the model establishing module, which feeds the data to the machine learning model in numerical form and establishes a bag-of-words model;
the model fitting module, which fits the established model so as to check how well the model generalizes to unseen data;
the model testing and predicting module, which tests the model, examines its predictions, and determines which words the model uses to make its decisions;
the model training module, which uses TF-IDF scores on top of the model so that the model focuses more on meaningful words;
the model optimization module, which captures semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training;
the syntax utilization module, which uses an end-to-end approach to treat each sentence as a sequence of word vectors.
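One way to picture how word embeddings help with words outside the training data is to represent a sentence by the average of its word vectors, so that a command containing an unseen but semantically close word lands near a command seen in training. The sketch below is an illustration under stated assumptions: the two-dimensional vectors are invented, not real Word2Vec output, and averaging is one common choice rather than a technique the source prescribes.

```python
import math

# Invented 2-dimensional vectors standing in for Word2Vec embeddings.
TOY_VECTORS = {
    "open": [0.9, 0.1],
    "launch": [0.85, 0.15],  # assumed to be absent from the training sentences
    "delete": [-0.7, 0.6],
    "browser": [0.2, 0.9],
}


def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence."""
    vectors = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


seen = sentence_vector("open browser")
unseen = sentence_vector("launch browser")
other = sentence_vector("delete browser")
similar_score = cosine(seen, unseen)
different_score = cosine(seen, other)
```

With these toy values, "launch browser" scores far closer to "open browser" than "delete browser" does, even though "launch" never appeared in the hypothetical training set.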
In order to achieve the above object, the present invention further provides a semantic understanding accuracy improving system applied to an intelligent voice mouse, including an intelligent voice mouse end, a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
To achieve the above object, the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above method.
The invention has the beneficial effects that:
(1) The semantic understanding accuracy improving method applied to the intelligent voice mouse combines the intelligent voice mouse with application software on the computer end, improving the accuracy of natural language semantic understanding.
(2) The latest natural language processing techniques from the field of artificial intelligence are applied to accurately analyze the user's intention.
Drawings
Fig. 1 is an overall flowchart of a semantic understanding accuracy improving method applied to an intelligent voice mouse according to embodiment 1 of the present invention.
Fig. 2 is a block diagram of a semantic understanding accuracy improving apparatus applied to an intelligent voice mouse according to embodiment 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example 1
Fig. 1 is an overall flowchart of the semantic understanding accuracy improving method applied to an intelligent voice mouse according to embodiment 1 of the present invention. As shown in fig. 1, the semantic understanding accuracy improving method applied to an intelligent voice mouse includes the following steps:
Step S1: acquiring voice data input by a user through the intelligent voice mouse end.
Step S2: cleaning the voice data to remove irrelevant audio such as noise.
Step S3: converting the data into numerical form and inputting it into a machine learning model.
In this step, the learning model is a bag-of-words model: a vocabulary containing all words in the data set is built and a unique index is assigned to each word in the vocabulary; each sentence is then represented as a list whose length equals the number of distinct words in the vocabulary, and each index in the list records the number of times the corresponding word appears in the sentence.
Step S4: classifying, namely splitting the data into a training set used to fit the model and a test set used to check how well the model generalizes to unseen data.
Step S5: inspecting, namely validating the model, interpreting its predictions, and determining which words the model uses to make its decisions.
Step S6: accounting for vocabulary structure, namely using TF-IDF (term frequency, inverse document frequency) scores on top of the model so that the model focuses more on meaningful words.
Step S7: leveraging semantics, namely capturing semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training.
Step S8: leveraging syntax, namely using an end-to-end approach to treat each sentence as a sequence of word vectors.
In this step, the end-to-end method includes GloVe or CoVe.
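Steps S4 and S6 can be sketched together as follows: the data is split into a training set and a test set, inverse document frequencies are fitted on the training set only, and each sentence is then weighted by TF-IDF. The function names, the smoothed IDF formula, the 25% test fraction and the sample commands are all assumptions chosen for illustration; the source does not fix any of them.

```python
import math
import random
from collections import Counter


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and hold out a fraction as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_idf(train_sentences):
    """Compute a smoothed inverse document frequency from training sentences."""
    n_docs = len(train_sentences)
    doc_freq = Counter()
    for sentence in train_sentences:
        doc_freq.update(set(sentence.lower().split()))
    # Assumed smoothing: idf = ln((1 + N) / (1 + df)) + 1
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tfidf(sentence, idf):
    """Weight each known word by term frequency times inverse document frequency."""
    counts = Counter(sentence.lower().split())
    total = sum(counts.values())
    return {w: (c / total) * idf[w] for w, c in counts.items() if w in idf}


# Hypothetical labelled voice commands: (transcript, intent).
samples = [
    ("open the file", "open"), ("open the window", "open"),
    ("open the browser", "open"), ("open the folder", "open"),
    ("close the file", "close"), ("close the window", "close"),
    ("close the browser", "close"), ("close the folder", "close"),
]
train, test = train_test_split(samples)
idf = fit_idf([text for text, _ in train])
weights = tfidf(test[0][0], idf)
```

Because "the" occurs in every training sentence, it receives the lowest IDF, so rarer and more informative words carry more weight. Fitting the IDF on the training set alone keeps the test set unseen, which is what makes the generalization check in step S4 meaningful.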
Example 2
Fig. 2 is a block diagram of the semantic understanding accuracy improving apparatus applied to an intelligent voice mouse according to embodiment 2 of the present invention. As shown in fig. 2, the embodiment provides a semantic understanding accuracy improving apparatus applied to an intelligent voice mouse, which includes:
the voice collection module, which acquires voice data input by a user through the intelligent voice mouse end;
the voice cleaning module, which removes irrelevant audio such as noise;
the model establishing module, which feeds the data to the machine learning model in numerical form and establishes a bag-of-words model;
the model fitting module, which fits the established model so as to check how well the model generalizes to unseen data;
the model testing and predicting module, which tests the model, examines its predictions, and determines which words the model uses to make its decisions;
the model training module, which uses TF-IDF (term frequency, inverse document frequency) scores on top of the model so that the model focuses more on meaningful words;
the model optimization module, which captures semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training;
the syntax utilization module, which uses an end-to-end approach to treat each sentence as a sequence of word vectors.
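The modular structure of the apparatus can be pictured as a chain of stages that each take and return a shared context. The sketch below is purely illustrative: the `Pipeline` class, the stage functions and the context keys are hypothetical, and the cleaning stage filters filler words from an already transcribed string as a stand-in for audio noise removal, which the source does not detail.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Stage = Callable[[dict], dict]


@dataclass
class Pipeline:
    """Run a list of module-like stages in order over a shared context."""
    stages: List[Stage] = field(default_factory=list)

    def run(self, context: dict) -> dict:
        for stage in self.stages:
            context = stage(context)
        return context


def collect(context: dict) -> dict:
    # Stand-in for the voice collection module: assumes the input is already text.
    context["transcript"] = context["raw_input"]
    return context


def clean(context: dict) -> dict:
    # Stand-in for the voice cleaning module: drops assumed filler tokens.
    fillers = {"um", "uh"}
    context["tokens"] = [
        t for t in context["transcript"].lower().split() if t not in fillers
    ]
    return context


def vectorize(context: dict) -> dict:
    # Stand-in for the model establishing module: bag-of-words counts.
    context["vector"] = [context["tokens"].count(w) for w in context["vocabulary"]]
    return context


result = Pipeline([collect, clean, vectorize]).run(
    {"raw_input": "um open the file", "vocabulary": ["open", "the", "file", "close"]}
)
```

Keeping each module behind the same call signature means later stages, such as TF-IDF weighting or embedding lookup, could be appended to the list without changing the earlier ones.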
Example 3
The embodiment provides a semantic understanding accuracy improving system applied to an intelligent voice mouse, which comprises an intelligent voice mouse end, a memory, a processor and a computer program, wherein the computer program is stored in the memory and can run on the processor, and the processor executes the computer program to realize the steps of the method.
Example 4
The present embodiment provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of the above-described method.
In summary, the semantic understanding accuracy improvement method, apparatus, system and storage medium applied to the intelligent voice mouse disclosed in the above embodiments of the present invention accurately analyze the intention of the user by using the current latest natural language processing technology in the artificial intelligence field, and improve the accuracy of the semantic understanding of the natural language by combining the intelligent voice mouse and the application software of the computer.
The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention falls within the scope of protection of the present invention; the scope of protection of the present invention shall therefore be subject to the scope of protection of the claims.

Claims (6)

1. A semantic understanding accuracy improving method applied to an intelligent voice mouse, characterized by comprising the following steps:
step S1: acquiring voice data input by a user through an intelligent voice mouse end;
step S2: cleaning the voice data to remove irrelevant audio such as noise;
step S3: converting the data into numerical form and inputting it into a machine learning model;
step S4: classifying, namely splitting the data into a training set used to fit the model and a test set used to check how well the model generalizes to unseen data;
step S5: inspecting, namely validating the model, interpreting its predictions, and determining which words the model uses to make its decisions;
step S6: accounting for vocabulary structure, namely using TF-IDF scores on top of the model so that the model focuses more on meaningful words;
step S7: leveraging semantics, namely capturing semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training;
step S8: leveraging syntax, namely using an end-to-end approach to treat each sentence as a sequence of word vectors.
2. The semantic understanding accuracy improving method applied to the intelligent voice mouse according to claim 1, characterized in that: the learning model in step S3 is a bag-of-words model, in which a vocabulary containing all words in the data set is built and a unique index is assigned to each word in the vocabulary; each sentence is represented as a list whose length equals the number of distinct words in the vocabulary, and each index in the list records the number of times the corresponding word appears in the sentence.
3. The semantic understanding accuracy improving method applied to the intelligent voice mouse according to claim 1, characterized in that: the end-to-end method in step S8 includes GloVe or CoVe.
4. A semantic understanding accuracy improving apparatus applied to an intelligent voice mouse, characterized by comprising:
the voice collection module, which acquires voice data input by a user through the intelligent voice mouse end;
the voice cleaning module, which removes irrelevant audio such as noise;
the model establishing module, which feeds the data to the machine learning model in numerical form and establishes a bag-of-words model;
the model fitting module, which fits the established model so as to check how well the model generalizes to unseen data;
the model testing and predicting module, which tests the model, examines its predictions, and determines which words the model uses to make its decisions;
the model training module, which uses TF-IDF scores on top of the model so that the model focuses more on meaningful words;
the model optimization module, which captures semantics with the help of Word2Vec so that the model can pick up high-signal words beyond those seen in training;
the syntax utilization module, which uses an end-to-end approach to treat each sentence as a sequence of word vectors.
5. A semantic understanding accuracy improving system applied to an intelligent voice mouse, comprising an intelligent voice mouse end, a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that: the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 3.
6. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed by a processor implements the steps of the method of any of claims 1 to 3.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910923025.1A CN110705311B (en) 2019-09-27 2019-09-27 Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium

Publications (2)

Publication Number Publication Date
CN110705311A (en) 2020-01-17
CN110705311B (en) 2022-11-25

Family

ID=69198226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910923025.1A Active CN110705311B (en) 2019-09-27 2019-09-27 Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium

Country Status (1)

Country Link
CN (1) CN110705311B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002086864A1 (en) * 2001-04-18 2002-10-31 Rutgers, The State University Of New Jersey System and method for adaptive language understanding by computers
CN109948163A (en) * 2019-03-25 2019-06-28 中国科学技术大学 Natural language semantic matching method with dynamic sequence reading
CN110209791A (en) * 2019-06-12 2019-09-06 百融云创科技股份有限公司 Multi-turn dialogue intelligent voice interaction system and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6937983B2 (en) * 2000-12-20 2005-08-30 International Business Machines Corporation Method and system for semantic speech recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A preliminary study of pragmatics-based natural language processing research and applications; 李蕾 et al.; 《智能系统学报》; 2006-10-15 (No. 02); full text *

Also Published As

Publication number Publication date
CN110705311A (en) 2020-01-17

Similar Documents

Publication Publication Date Title
KR102117160B1 (en) A text processing method and device based on ambiguous entity words
US10831762B2 (en) Extracting and denoising concept mentions using distributed representations of concepts
US9805718B2 (en) Clarifying natural language input using targeted questions
KR102316063B1 (en) Method and apparatus for identifying key phrase in audio data, device and medium
CN104143329B (en) Carry out method and the device of voice keyword retrieval
US10169703B2 (en) System and method for analogy detection and analysis in a natural language question and answering system
CN108710704B (en) Method and device for determining conversation state, electronic equipment and storage medium
US10157174B2 (en) Utilizing a dialectical model in a question answering system
WO2022188584A1 (en) Similar sentence generation method and apparatus based on pre-trained language model
CN112037773B (en) N-optimal spoken language semantic recognition method and device and electronic equipment
CN112101045B (en) Multi-mode semantic integrity recognition method and device and electronic equipment
JP6526470B2 (en) Pre-construction method of vocabulary semantic patterns for text analysis and response system
CN111651998B (en) Weak supervision deep learning semantic analysis method under virtual reality and augmented reality scenes
CN110991175B (en) Method, system, equipment and storage medium for generating text in multi-mode
CN112580339B (en) Model training method and device, electronic equipment and storage medium
CN110196963A (en) Model generation, the method for semantics recognition, system, equipment and storage medium
CN114096960A (en) Natural language response for machine-assisted agents
CN114330371A (en) Session intention identification method and device based on prompt learning and electronic equipment
CN106708950B (en) Data processing method and device for intelligent robot self-learning system
CN115345177A (en) Intention recognition model training method and dialogue method and device
CN112257432A (en) Self-adaptive intention identification method and device and electronic equipment
CN110705311B (en) Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium
CN116821290A (en) Multitasking dialogue-oriented large language model training method and interaction method
CN115169370A (en) Corpus data enhancement method and device, computer equipment and medium
CN112015921B (en) Natural language processing method based on learning auxiliary knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant