CN114218356B - Semantic recognition method, device, equipment and storage medium based on artificial intelligence
- Publication number
- CN114218356B (application CN202111537450.0A)
- Authority
- CN
- China
- Prior art keywords
- score
- translation
- data
- result
- candidate
- Prior art date
- Legal status
- Active
Classifications
- G06F16/3344—Query execution using natural language analysis
- G06F40/205—Parsing
- G06F40/30—Semantic analysis
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/08—Learning methods
Abstract
The invention relates to artificial intelligence and discloses a semantic recognition method, device, equipment and storage medium based on artificial intelligence. The method comprises: receiving voice response data sent by a client; performing speech-to-text processing on the voice response data to obtain at least two pieces of translation text data; performing translation scoring on the translation text data through a preset translation evaluation model and taking the k translation text data with the highest translation scores as target data; recalling, for each target data, n candidate results associated with the target data from a database; analyzing, through a twin network model, the text semantic matching degree between the translation text and each candidate result to obtain a similarity score; ranking all prediction results from high to low by a composite score combining the translation score and the similarity score; returning the highest-scoring prediction result as the target result; and obtaining the translation text data corresponding to the target result as the recognition result.
Description
Technical Field
The present invention relates to the field of natural language processing, and in particular, to a semantic recognition method, apparatus, device and storage medium based on artificial intelligence.
Background
Natural Language Understanding (NLU) is one of the important areas of artificial intelligence: it is effortless for humans but remains very challenging for machines. People most commonly express their thoughts and ideas in spoken language, and by extracting and analyzing data from that spoken language, its semantics can be identified. Quickly and accurately identifying customer needs in this way has broad application in the intelligent customer service industry and can help enterprises improve service quality and customer satisfaction.
In a conventional intelligent semantic recognition system, a user's answer is converted to a text translation result by an Automatic Speech Recognition (ASR) module and then matched against a database through BM25, regular-expression rules, or the like, thereby recognizing the user's intention. However, a single text translation result combined with simple data matching often fails to correctly identify the intent of the user's answer when faced with more complex spoken sentences.
Disclosure of Invention
The embodiment of the invention provides a semantic recognition method, a semantic recognition device, computer equipment and a storage medium based on artificial intelligence so as to improve the accuracy of semantic recognition.
A semantic recognition method based on artificial intelligence, comprising:
Receiving voice response data sent by a client;
performing voice conversion text processing on the voice response data by adopting a voice recognition algorithm to obtain at least two translation text data;
performing translation scoring on the translation text data through a preset translation evaluation model, and acquiring the k translation text data with the highest translation scores as target data, wherein k is a positive integer;
recalling, for each target data, n candidate results associated with the target data from a database according to a set recall strategy, wherein n is a positive integer and each target data corresponds to a plurality of candidate results in the database;
transmitting each target data and the recalled candidate results into a twin network model, and analyzing the text semantic matching degree between the translation text and the candidate results through the twin network model to obtain a similarity score;
and determining a composite score by combining the translation score and the similarity score, ranking all prediction results from high to low by composite score, returning the prediction result with the highest score as the target result, and acquiring the translation text data corresponding to the target result as the recognition result.
Optionally, recalling n candidate results associated with the target data from a database by the set recall policy includes:
Carrying out morpheme analysis on each candidate result corresponding to the target data in the database to obtain a plurality of basic morphemes, and taking the basic morphemes corresponding to the same candidate result as a group of basic morphemes;
calculating a relevance score of each base morpheme to the target data for each set of base morphemes;
weighting and summing the relevance scores to obtain a relevance score of the group of basic morphemes and the target data;
and sorting all the relevance scores in descending order, and selecting the first n candidate results as the recalled candidate results.
Optionally, the weighting and summing the relevance scores to obtain a relevance score of the set of base morphemes and the target data includes:
The relevance score of the set of base morphemes to the target data is calculated using the following formula:

Score(Q, d) = Σᵢ wᵢ · R(qᵢ, d)

wherein Score(Q, d) is the relevance score between the set of basic morphemes and the target data; Q represents a candidate result, qᵢ represents one morpheme obtained by analyzing the candidate result, and d is the target data; wᵢ represents the weight of morpheme qᵢ; and R(qᵢ, d) represents the relevance score of morpheme qᵢ with the target data d.
Optionally, the sorting all the relevance scores in descending order and selecting the first n candidate results as the recalled candidate results includes:
Acquiring a preset correlation score threshold;
Comparing the correlation scores corresponding to the n recalled candidate results with the preset correlation score threshold value to obtain a comparison result;
And if the correlation score corresponding to the recalled candidate result in the comparison result is smaller than the preset correlation score threshold, taking the candidate result as an invalid candidate result, and eliminating the invalid candidate result from the recalled candidate result.
Optionally, the obtaining k translation text data with the highest translation scores as target data includes:
based on a minimum-heap Top-k algorithm, arbitrarily selecting the translation scores of k translation text data from all translation text data and establishing a minimum heap, wherein the minimum heap comprises a heap top, the heap top is the minimum score among the translation scores of the k translation text data, and the unselected translation text data is taken as the remaining translation data;
selecting the translation score of any one of the remaining translation data as a comparison score, and comparing the comparison score with the heap-top score, until all the remaining translation data have been selected;
if the comparison score is not greater than the heap-top score, returning to select the translation score of another remaining translation data as the comparison score and continuing;
and if the comparison score is greater than the heap-top score, taking the comparison score as the new heap-top score.
Optionally, the analyzing text semantic matching degree between the translated text and the candidate result through the twin network model, and obtaining a similarity score includes:
coding the translation text and the candidate result respectively through a pair of long short-term memory neural networks to obtain a first code and a second code;
measuring the spatial similarity of the first code and the second code by using Manhattan distance;
the similarity score is determined based on the spatial similarity.
Optionally, after obtaining the translated text data corresponding to the target result as the recognition result, the method further includes:
taking the translated text data and the recognition result as new annotation data;
and training the twin network model by adopting the new annotation data to obtain an updated twin network model.
A semantic recognition device based on artificial intelligence, comprising:
the data receiving module is used for receiving voice response data sent by the client;
The text translation module is used for carrying out voice conversion text processing on the voice response data by adopting a voice recognition algorithm to obtain at least two translation text data;
the data scoring module is used for scoring the translation text data through a preset translation evaluation model, and obtaining k translation text data with the highest translation score as target data, wherein k is a positive integer;
the associated recall module is used for recalling n candidate results associated with the target data from a database through a set recall strategy according to each target data, wherein n is a positive integer, and each target data corresponds to a plurality of candidate results in the database;
The semantic matching module is used for transmitting each target data and the recalled candidate result into a twin network model, and analyzing text semantic matching degree between the translation text and the candidate result through the twin network model to obtain a similarity score;
and the result determining module is used for determining a composite score by combining the translation score and the similarity score, ranking all prediction results from high to low by composite score, returning the prediction result with the highest score as the target result, and acquiring the translation text data corresponding to the target result as the recognition result.
Optionally, the association recall module includes:
The morpheme analyzing unit is used for carrying out morpheme analysis on each candidate result corresponding to the target data in the database to obtain a plurality of basic morphemes, and the basic morphemes corresponding to the same candidate result are used as a group of basic morphemes;
A first correlation calculation unit configured to calculate, for each group of the base morphemes, a correlation score of each of the base morphemes with the target data;
The second correlation calculation unit is used for carrying out weighted summation on the correlation scores to obtain correlation scores of the group of basic morphemes and the target data;
And the candidate result determining unit is used for sorting all the relevance scores in descending order and selecting the first n candidate results as the recalled candidate results.
Optionally, the second correlation calculation unit includes:
a calculating subunit, configured to calculate the relevance score of the set of basic morphemes and the target data using the following formula:

Score(Q, d) = Σᵢ wᵢ · R(qᵢ, d)

wherein Score(Q, d) is the relevance score between the set of basic morphemes and the target data; Q represents a candidate result, qᵢ represents one morpheme obtained by analyzing the candidate result, and d is the target data; wᵢ represents the weight of morpheme qᵢ; and R(qᵢ, d) represents the relevance score of morpheme qᵢ with the target data d.
Optionally, the candidate result determining unit includes:
The threshold value acquisition subunit is used for acquiring a preset correlation score threshold value;
the comparison subunit is used for comparing the correlation scores corresponding to the n recalled candidate results with the preset correlation score threshold value to obtain comparison results;
and the candidate result updating subunit is used for taking the candidate result as an invalid candidate result if the correlation score corresponding to the recalled candidate result in the comparison result is smaller than the preset correlation score threshold value, and eliminating the invalid candidate result from the recalled candidate result.
Optionally, the data scoring module includes:
A minimum heap construction unit, configured to arbitrarily select, based on a minimum heap Top-k algorithm, translation scores of k translation text data from all translation text data, and establish a minimum heap, where the minimum heap includes a heap Top, which is a minimum score among the translation scores of k translation text data, and unselected translation text data is used as remaining translation data;
A comparison unit, configured to select the translation score of any one of the remaining translation data as a comparison score, and compare the comparison score with the heap-top score, until all the remaining translation data have been selected;
a first execution unit, configured to return to select the translation score of another remaining translation data as the comparison score and continue if the comparison score is not greater than the heap-top score;
and a second execution unit, configured to take the comparison score as the new heap-top score if the comparison score is greater than the heap-top score.
Optionally, the semantic matching module includes:
The coding unit is used for coding the translation text and the candidate result respectively through a pair of long short-term memory neural networks to obtain a first code and a second code;
A similarity calculation unit for measuring the spatial similarity of the first code and the second code by using Manhattan distance;
and a score determining unit configured to determine the similarity score based on the spatial similarity.
Optionally, the apparatus comprises:
The annotation data determining module is used for taking the translation text data and the recognition result as new annotation data;
and the updating training module is used for training the twin network model by adopting the new annotation data to obtain an updated twin network model.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the artificial intelligence based semantic recognition method described above when the computer program is executed.
A computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the artificial intelligence based semantic recognition method described above.
According to the semantic recognition method, device, computer equipment and storage medium based on artificial intelligence, on the one hand, voice response data sent by a client is received, a voice recognition algorithm is adopted to perform speech-to-text processing on the voice response data to obtain at least two pieces of translation text data, translation scoring is then performed on the translation text data through a preset translation evaluation model, and the k translation text data with the highest translation scores are taken as target data. This avoids the recognition errors caused by relying on a single translation result and improves recognition accuracy. On the other hand, for each target data, n candidate results associated with the target data are recalled from a database through a set recall strategy; each target data and the recalled candidate results are transmitted into a twin network model, and the text semantic matching degree between the translation text and the candidate results is analyzed through the twin network model to obtain a similarity score; a composite score is determined by combining the translation score and the similarity score, all prediction results are ranked from high to low by composite score, the prediction result with the highest score is returned as the target result, and the translation text data corresponding to the target result is obtained as the recognition result. By recalling the candidate results associated with the target data from the database and semantically matching them against the target data, accurate recognition of the target data is achieved and the accuracy of semantic recognition is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application of an artificial intelligence based semantic recognition method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of an implementation of an artificial intelligence based semantic recognition method provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of an artificial intelligence based semantic recognition device according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
Referring to fig. 1, as shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
The server may be an independent server, or may be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Networks (CDN), and big data and artificial intelligence platforms.
It should be noted that the semantic recognition method based on artificial intelligence provided by the embodiments of the application is executed by the server, and correspondingly, the artificial-intelligence-based semantic recognition device is disposed in the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation requirements, and the terminal devices 101, 102, 103 in the embodiment of the present application may specifically correspond to application systems in actual production.
Referring to fig. 2, fig. 2 shows a semantic recognition method based on artificial intelligence according to an embodiment of the present invention, which is described in detail as follows:
s201: and receiving voice response data sent by the client.
Specifically, voice response data of the client in the voice consultation process is received through a network transmission protocol, wherein the voice response data refers to voice data sent by the client to the server in the communication process of the client and the server.
S202: and performing voice conversion text processing on the voice response data by adopting a voice recognition algorithm to obtain at least two pieces of translation text data.
A voice recognition algorithm (Automatic Speech Recognition, ASR) is an algorithmic model that takes speech as its research object and enables a machine, through speech signal processing and pattern recognition, to automatically recognize and understand human spoken language. Speech recognition technology lets a machine convert a speech signal into corresponding text or commands through a process of recognition and understanding. It is an interdisciplinary field closely related to acoustics, phonetics, linguistics, information theory, pattern recognition theory, and neurobiology; in essence it is pattern recognition, and it comprises three basic units: feature extraction, pattern matching, and a reference pattern library.
It should be noted that, the speech recognition algorithm in this embodiment includes a plurality of different translation models, and at least two different translation text data are obtained based on the different translation models.
In particular, translation models include, but are not limited to, Hidden Markov Models (HMM), Linear Predictive Cepstral Coefficients (LPCC), and Mel-Frequency Cepstral Coefficients (MFCC), among others.
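As a concrete illustration of the MFCC front end named above, the following is a minimal sketch using the third-party librosa library; the choice of librosa, the 16 kHz sampling rate, and the 13 coefficients are assumptions, since the patent prescribes no specific implementation.

```python
# A hedged sketch of MFCC feature extraction for the voice response data.
# librosa and all parameter values here are illustrative assumptions.
import librosa

def extract_mfcc(wav_path: str, n_mfcc: int = 13):
    # Load the voice response data at a 16 kHz sampling rate.
    signal, sr = librosa.load(wav_path, sr=16000)
    # Compute Mel-frequency cepstral coefficients frame by frame; these
    # features would feed a downstream acoustic model such as an HMM.
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
```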
S203: and carrying out translation scoring on the translation text data through a preset translation evaluation model, and acquiring k translation text data with the highest translation scoring as target data, wherein k is a positive integer.
The preset translation evaluation model refers to a model that scores translation texts against preset rationality rules for grammatical normativity and word-sense collocation. The specific grammatical and word-sense collocation rules may be set according to the requirements of the actual scenario and are not specifically limited herein.
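Since the patent leaves the evaluation model open, one hedged stand-in for such a scorer is a length-normalized language-model fluency score in place of the grammar and word-sense rules; the unigram model below is purely illustrative.

```python
# A minimal, assumed stand-in for the translation evaluation model:
# score a transcript by its average log-likelihood under a unigram model.
import math

def translation_score(tokens: list[str], unigram_prob: dict[str, float]) -> float:
    # Unseen tokens get a small floor probability (an assumption).
    log_p = sum(math.log(unigram_prob.get(t, 1e-8)) for t in tokens)
    return log_p / max(len(tokens), 1)  # length-normalized; higher = more fluent
```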
In an alternative embodiment, in step S203, k pieces of translation text data with the highest translation scores are obtained as the target data, including:
Based on a minimum heap Top-k algorithm, randomly selecting translation scores of k translation text data from all translation text data, and establishing a minimum heap, wherein the minimum heap comprises a heap Top, and the heap Top is the minimum score in the translation scores of the k translation text data, and unselected translation text data is taken as residual translation data;
selecting the translation score of any one of the remaining translation data as a comparison score, and comparing the comparison score with the heap-top score, until all the remaining translation data have been selected;
if the comparison score is not greater than the heap-top score, returning to select the translation score of another remaining translation data as the comparison score and continuing;
if the comparison score is greater than the heap-top score, taking the comparison score as the new heap-top score.
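A minimal Python sketch of this minimum-heap Top-k selection follows; the function name and the (score, text) tuple layout are illustrative assumptions.

```python
# Keep only the k highest-scoring translations using a size-k min-heap:
# the heap top always holds the smallest retained score.
import heapq

def top_k_translations(scored: list[tuple[float, str]], k: int) -> list[tuple[float, str]]:
    """scored: (translation_score, translation_text) pairs."""
    heap = list(scored[:k])     # arbitrarily take k entries ...
    heapq.heapify(heap)         # ... and build the minimum heap
    for item in scored[k:]:     # the remaining translation data
        if item[0] > heap[0][0]:           # comparison score > heap-top score
            heapq.heapreplace(heap, item)  # evict the old heap top
        # otherwise continue with the next remaining item
    return sorted(heap, reverse=True)      # the k best, highest first
```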
S204: and recalling n candidate results associated with the target data from the database according to a set recall strategy aiming at each target data, wherein n is a positive integer, and each target data corresponds to a plurality of candidate results in the database.
Specifically, each target data corresponds to a plurality of candidate results in the database; the candidate results are standard sentences stored in the database that share the same context as the target data. The manner of acquiring the n candidate results is described in the following embodiments.
In an alternative embodiment, in step S204, recalling n candidate results associated with the target data from the database by the set recall policy includes:
carrying out morpheme analysis on each candidate result corresponding to the target data in the database to obtain a plurality of basic morphemes, and taking the basic morphemes corresponding to the same candidate result as a group of basic morphemes;
for each group of basic morphemes, calculating a correlation score of each basic morpheme and target data;
weighting and summing the correlation scores to obtain a correlation score of the group of basic morphemes and the target data;
and sorting all the relevance scores in descending order, and selecting the first n candidate results as the recalled candidate results.
The specific weighting method may be to preset dynamically generated weighting conditions according to, for example, the occurrence frequency and part of speech of each morpheme, and then weight each morpheme accordingly; this is not specifically limited herein.
In an alternative embodiment, weighting and summing the relevance scores to obtain a relevance score for the set of base morphemes and the target data includes:
The relevance score of the set of base morphemes to the target data is calculated using the following formula:

Score(Q, d) = Σᵢ wᵢ · R(qᵢ, d)

wherein Score(Q, d) is the relevance score between the set of basic morphemes and the target data; Q represents a candidate result, qᵢ represents one morpheme obtained by analyzing the candidate result, and d is the target data; wᵢ represents the weight of morpheme qᵢ; and R(qᵢ, d) represents the relevance score of morpheme qᵢ with the target data d.
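The weighted summation above reduces to a few lines of code; the sketch below assumes per-morpheme weights supplied as a dictionary and an externally provided relevance function r (for example, a BM25-style term score), neither of which the patent fixes.

```python
# Score(Q, d) = sum_i w_i * R(q_i, d): weighted sum of per-morpheme
# relevance scores for one group of base morphemes against target data d.
from typing import Callable

def relevance_score(morphemes: list[str],
                    target: str,
                    weight: dict[str, float],
                    r: Callable[[str, str], float]) -> float:
    # Morphemes without a preset weight default to 1.0 (an assumption).
    return sum(weight.get(q, 1.0) * r(q, target) for q in morphemes)
```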
In an alternative embodiment, sorting all the relevance scores in descending order and selecting the first n candidate results as the recalled candidate results includes:
Acquiring a preset correlation score threshold;
comparing the relevance scores corresponding to the n recalled candidate results with a preset relevance score threshold value to obtain a comparison result;
If the correlation score corresponding to the recalled candidate result in the comparison result is smaller than the preset correlation score threshold, the candidate result is taken as an invalid candidate result, and the invalid candidate result is removed from the recalled candidate result.
The preset relevance score threshold may be set according to an actual application scenario, which is not limited herein.
S205: and transmitting each target data and the recalled candidate result into a twin network model, and analyzing the text semantic matching degree between the translation text and the candidate result through the twin network model to obtain a similarity score.
Specifically, a twin neural network (Siamese neural network) is a coupled framework built on two artificial neural networks. A twin neural network takes two samples as input and outputs their representations embedded in a high-dimensional space, so as to compare the degree of similarity of the two samples. A twin neural network in the narrow sense is formed by two neural networks that have the same structure and share weights. A twin neural network in the broad sense, or "pseudo-Siamese network" (pseudo-siamese network), may be formed by any two neural networks. Twin neural networks typically have a deep structure and may consist of convolutional neural networks, recurrent neural networks, and the like. In this embodiment, the target data and the recalled candidate results are used as the input of the twin network model, and matching calculation is performed through the twin network model to obtain the similarity scores of the target data and the recalled candidate results.
In an alternative embodiment, in step S205, analyzing, by the twin network model, the text semantic matching degree between the translated text and the candidate result, and obtaining the similarity score includes:
coding the translation text and the candidate result respectively through a pair of long short-term memory neural networks to obtain a first code and a second code;
Measuring the spatial similarity of the first code and the second code by using Manhattan distance;
a similarity score is determined based on the spatial similarity.
The Long Short-Term Memory network (LSTM) is a recurrent neural network specifically designed to solve the long-term dependency problem of the ordinary RNN (recurrent neural network). All RNNs take the form of a chain of repeating neural network modules; in a standard RNN, this repeated module has only a very simple structure, such as a single tanh layer.
In the present application, considering that each word in a sentence carries a different meaning in different contexts, the long short-term memory network can weight word semantics in combination with their context, which helps improve the accuracy of the resulting codes and, in turn, the accuracy of the subsequent similarity score calculation.
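A minimal PyTorch sketch of this twin LSTM matcher follows: both inputs share one LSTM encoder, and the Manhattan (L1) distance between the two codes is mapped to a similarity in (0, 1]. The layer sizes and the exp(-distance) mapping are assumptions (the mapping follows the common "MaLSTM" formulation), not details prescribed by the patent.

```python
# Twin (Siamese) LSTM: shared encoder + Manhattan-distance similarity.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One LSTM serves both branches, so the weights are shared.
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        _, (h, _) = self.lstm(self.embed(token_ids))
        return h[-1]  # final hidden state as the sentence code

    def forward(self, text_a: torch.Tensor, text_b: torch.Tensor) -> torch.Tensor:
        a, b = self.encode(text_a), self.encode(text_b)   # first and second codes
        manhattan = torch.sum(torch.abs(a - b), dim=1)    # Manhattan distance
        return torch.exp(-manhattan)                      # similarity score in (0, 1]
```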
S206: and determining a comprehensive score by combining the translation score and the similarity score, arranging all the predicted results according to the comprehensive score from high to low, returning the predicted result with the highest score as a target result, and acquiring translation text data corresponding to the target result as a recognition result.
Specifically, the translation scores and the similarity scores are integrated to obtain integrated scores, the integrated scores are further sequenced according to the sequence from high to low, the predicted results corresponding to the integrated scores are sequenced based on the sequenced sequences, the predicted result with the highest integrated score is selected as a target result, and translation text data corresponding to the target result is obtained and is used as a final recognition result.
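The ranking step is sketched below; the equal-weighted linear combination (alpha = 0.5) is an assumption, since the patent only states that the two scores are combined.

```python
# Rank prediction results by a composite of translation and similarity scores.
def pick_target(predictions: list[tuple[float, float, str, str]],
                alpha: float = 0.5) -> str:
    """predictions: (translation_score, similarity_score, translation_text, candidate)."""
    ranked = sorted(predictions,
                    key=lambda p: alpha * p[0] + (1 - alpha) * p[1],
                    reverse=True)  # high-to-low by composite score
    best = ranked[0]               # the target result
    return best[2]                 # its translation text = the recognition result
```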
In an optional embodiment, after step S206, that is, after obtaining the translated text data corresponding to the target result, as the recognition result, the method further includes:
Taking the translated text data and the recognition result as new annotation data;
and training the twin network model by adopting new annotation data to obtain an updated twin network model.
The translation texts produced during speech recognition are recorded to serve as new annotation data, so as to improve the accuracy of the speech recognition algorithm. In addition, the historical prediction results of the twin network model are recorded and, after labeling, used to construct new training data, so as to improve the text matching capability of the model and thereby the accuracy of semantic recognition.
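One hedged sketch of this update step, reusing the SiameseLSTM above: each (translation text, recognition result) pair is treated as a positive example with target similarity 1.0. The MSE loss against that target is an assumption; the patent does not specify a training objective.

```python
# One incremental training step for the twin network on new annotation data.
import torch
import torch.nn.functional as F

def update_twin_model(model, optimizer, text_ids, result_ids):
    model.train()
    optimizer.zero_grad()
    sim = model(text_ids, result_ids)             # predicted similarity
    loss = F.mse_loss(sim, torch.ones_like(sim))  # positive pair: target 1.0
    loss.backward()
    optimizer.step()
    return loss.item()
```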
In this embodiment, on the one hand, voice response data sent by a client is received, a voice recognition algorithm is adopted to perform speech-to-text processing on the voice response data to obtain at least two pieces of translation text data, translation scoring is then performed on the translation text data through a preset translation evaluation model, and the k translation text data with the highest translation scores are taken as target data, which avoids the recognition errors caused by relying on a single translation result and improves recognition accuracy. On the other hand, for each target data, n candidate results associated with the target data are recalled from a database through a set recall strategy; each target data and the recalled candidate results are transmitted into a twin network model, and the text semantic matching degree between the translation text and the candidate results is analyzed through the twin network model to obtain a similarity score; a composite score is determined by combining the translation score and the similarity score, all prediction results are ranked from high to low by composite score, the prediction result with the highest score is returned as the target result, and the translation text data corresponding to the target result is obtained as the recognition result. By recalling the candidate results associated with the target data from the database and semantically matching them against the target data, accurate recognition of the target data is achieved and the accuracy of semantic recognition is improved.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, an artificial intelligence based semantic recognition device is provided, where the artificial intelligence based semantic recognition device corresponds to the artificial intelligence based semantic recognition method in the above embodiment one by one. As shown in fig. 3, fig. 3 is a schematic diagram of the semantic recognition device based on artificial intelligence, including: a data receiving module 31, a text translation module 32, a data scoring module 33, an association recall module 34, a semantic matching module 35, and a result determination module 36. The functional modules are described in detail as follows:
A data receiving module 31, configured to receive voice response data sent by the client;
A text translation module 32 for performing a voice-to-text process on the voice response data using a voice recognition algorithm to obtain at least two translated text data;
The data scoring module 33 is configured to score the translation text data by using a preset translation evaluation model, and obtain k translation text data with the highest translation score as target data, where k is a positive integer;
an association recall module 34, configured to recall, for each target data, n candidate results associated with the target data from the database by using a set recall policy, where n is a positive integer, and each target data corresponds to a plurality of candidate results in the database;
The semantic matching module 35 is configured to transfer each target data and the recalled candidate result into a twin network model, analyze the text semantic matching degree between the translation text and the candidate result through the twin network model, and obtain a similarity score;
The result determining module 36 is configured to determine a composite score by combining the translation score and the similarity score, rank all prediction results from high to low by composite score, return the prediction result with the highest score as the target result, and obtain the translation text data corresponding to the target result as the recognition result.
Optionally, the association recall module 34 includes:
The morpheme analyzing unit is used for carrying out morpheme analysis on each candidate result corresponding to the target data in the database to obtain a plurality of basic morphemes, and the basic morphemes corresponding to the same candidate result are used as a group of basic morphemes;
A first correlation calculation unit, configured to calculate, for each group of basic morphemes, a relevance score of each basic morpheme with the target data;
The second correlation calculation unit is used for carrying out weighted summation on the correlation scores to obtain the correlation scores of the group of basic morphemes and the target data;
And the candidate result determining unit, configured to sort all the relevance scores in descending order and select the first n candidate results as the recalled candidate results.
Optionally, the second correlation calculation unit includes:
A calculating subunit, configured to calculate the relevance score of the set of basic morphemes and the target data using the following formula:

Score(Q, d) = Σᵢ wᵢ · R(qᵢ, d)

wherein Score(Q, d) is the relevance score between the set of basic morphemes and the target data; Q represents a candidate result, qᵢ represents one morpheme obtained by analyzing the candidate result, and d is the target data; wᵢ represents the weight of morpheme qᵢ; and R(qᵢ, d) represents the relevance score of morpheme qᵢ with the target data d.
Optionally, the candidate result determining unit includes:
The threshold value acquisition subunit is used for acquiring a preset correlation score threshold value;
The comparison subunit is used for comparing the correlation scores corresponding to the n recalled candidate results with a preset correlation score threshold value to obtain comparison results;
And the candidate result updating subunit is used for taking the candidate result as an invalid candidate result and removing the invalid candidate result from the recalled candidate result if the correlation score corresponding to the recalled candidate result in the comparison result is smaller than the preset correlation score threshold value.
Optionally, the data scoring module 33 includes:
a minimum heap construction unit, configured to arbitrarily select, based on a minimum heap Top-k algorithm, translation scores of k translation text data from all translation text data, and establish a minimum heap, where the minimum heap includes a heap Top, which is a minimum score among the translation scores of k translation text data, and unselected translation text data is used as remaining translation data;
the comparison unit is used for selecting the translation score of any one of the remaining translation data as a comparison score, and comparing the comparison score with the heap-top score, until all the remaining translation data have been selected;
the first execution unit is used for returning to select the translation score of another remaining translation data as the comparison score and continuing if the comparison score is not greater than the heap-top score;
and the second execution unit is used for taking the comparison score as the new heap-top score if the comparison score is greater than the heap-top score.
Optionally, the semantic matching module 35 includes:
The coding unit is used for coding the translation text and the candidate result respectively through a pair of long short-term memory neural networks to obtain a first code and a second code;
A similarity calculation unit for measuring the spatial similarity of the first code and the second code by using the Manhattan distance;
and a score determining unit for determining a similarity score based on the spatial similarity.
Optionally, the semantic recognition device based on artificial intelligence further comprises:
The annotation data determining module is used for taking the translation text data and the recognition result as new annotation data;
And the updating training module is used for training the twin network model by adopting the new annotation data to obtain an updated twin network model.
For specific limitations on the artificial-intelligence-based semantic recognition device, reference may be made to the above limitations on the artificial-intelligence-based semantic recognition method, which are not repeated here. The modules in the above artificial-intelligence-based semantic recognition device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42 and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 with the components memory 41, processor 42 and network interface 43 is shown in the figure, but it should be understood that not all of the illustrated components need be implemented, and more or fewer components may be implemented instead. As will be appreciated by those skilled in the art, the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used to store the operating system and various application software installed on the computer device 4, such as the program code for controlling electronic files. Further, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code or process data stored in the memory 41.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
The present application also provides another embodiment, namely a computer-readable storage medium storing a computer program executable by at least one processor, so as to cause the at least one processor to perform the steps of the artificial-intelligence-based semantic recognition method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, and of course may also be implemented by hardware, though in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some, but not all, embodiments of the present application; the preferred embodiments are shown in the drawings, which do not limit the scope of the claims. This application may be embodied in many different forms; these embodiments are provided so that this disclosure will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. Any equivalent structure made using the contents of the specification and drawings of the application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the application.
Claims (6)
1. A semantic recognition method based on artificial intelligence, characterized by comprising the following steps:
Receiving voice response data sent by a client;
performing voice conversion text processing on the voice response data by adopting a voice recognition algorithm to obtain at least two translation text data;
performing translation scoring on the translation text data through a preset translation evaluation model, and acquiring the k translation text data with the highest translation scores as target data, wherein k is a positive integer;
recalling, for each target data, n candidate results associated with the target data from a database through a set recall strategy, wherein n is a positive integer, each target data corresponds to a plurality of candidate results in the database, and the candidate results are standard sentences stored in the database that share the same context as the target data;
transmitting each target data and the recalled candidate results into a twin network model, and analyzing the text semantic matching degree between the translation text and the candidate results through the twin network model to obtain a similarity score;
determining a composite score by combining the translation score and the similarity score, ranking all prediction results from high to low by composite score, returning the prediction result with the highest score as the target result, and acquiring the translation text data corresponding to the target result as the recognition result;
the recalling the n candidate results associated with the target data from the database through the set recall strategy comprises:
Carrying out morpheme analysis on each candidate result corresponding to the target data in the database to obtain a plurality of basic morphemes, and taking the basic morphemes corresponding to the same candidate result as a group of basic morphemes;
calculating, for each group of basic morphemes, a relevance score of each basic morpheme with the target data;
weighting and summing the relevance scores to obtain a relevance score of the group of basic morphemes and the target data;
sorting all the relevance scores in descending order, and selecting the first n candidate results as the recalled candidate results;
wherein the sorting all the relevance scores in descending order and selecting the first n candidate results as the recalled candidate results comprises:
Acquiring a preset correlation score threshold;
Comparing the correlation scores corresponding to the n recalled candidate results with the preset correlation score threshold value to obtain a comparison result;
If the correlation score corresponding to the recalled candidate result in the comparison result is smaller than the preset correlation score threshold, the candidate result is used as an invalid candidate result, and the invalid candidate result is removed from the recalled candidate result;
Analyzing the text semantic matching degree between the translation text and the candidate result through the twin network model, and obtaining a similarity score comprises the following steps:
coding the translation text and the candidate result respectively through a pair of long short-term memory neural networks to obtain a first code and a second code;
measuring the spatial similarity of the first code and the second code by using Manhattan distance;
determining the similarity score based on the spatial similarity;
After obtaining the translation text data corresponding to the target result as the recognition result, the method further comprises:
taking the translated text data and the recognition result as new annotation data;
and training the twin network model by adopting the new annotation data to obtain an updated twin network model.
2. The artificial intelligence based semantic recognition method of claim 1, wherein the weighting and summing the relevance scores to obtain the relevance score of the set of base morphemes and the target data comprises:
calculating the relevance score between the group of basic morphemes and the target data using the formula:

Score(Q, d) = Σ_i W_i · R(q_i, d)

wherein Score(Q, d) is the relevance score between the group of basic morphemes and the target data, Q denotes the candidate result, q_i denotes a morpheme obtained by analyzing the candidate result, d is the target data, W_i denotes the weight of the morpheme q_i, and R(q_i, d) denotes the relevance score between the morpheme q_i and the target data d.
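A minimal sketch of this weighted summation together with the recall-and-filter steps of claim 1. How the per-morpheme weights W_i and relevance scores R(q_i, d) are computed is not fixed by the patent (a BM25-style weighting would be one conventional choice), so `score_fn` and the other names below are illustrative assumptions.

```python
from typing import Callable, Sequence

def group_relevance_score(weights: Sequence[float],
                          morpheme_scores: Sequence[float]) -> float:
    """Score(Q, d) = sum_i W_i * R(q_i, d) for one candidate's morpheme group."""
    if len(weights) != len(morpheme_scores):
        raise ValueError("one weight per morpheme is required")
    return sum(w * r for w, r in zip(weights, morpheme_scores))

def recall_candidates(candidates: Sequence[str], target: str, n: int,
                      threshold: float,
                      score_fn: Callable[[str, str], float]) -> list[tuple[float, str]]:
    """Score every candidate against the target data, keep the n highest-scoring
    ones, then drop any whose score falls below the preset threshold."""
    scored = sorted(((score_fn(c, target), c) for c in candidates), reverse=True)
    return [(s, c) for s, c in scored[:n] if s >= threshold]
```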
3. The artificial intelligence based semantic recognition method according to claim 1, wherein obtaining the k translation text data with the highest translation scores as the target data comprises:
based on a min-heap Top-k algorithm, arbitrarily selecting the translation scores of k translation text data from all the translation text data and building a min-heap, wherein the heap top holds the minimum of those k translation scores, and the unselected translation text data is taken as the remaining translation data;
selecting the translation score of any one of the remaining translation data as a comparison score, and comparing the comparison score with the heap-top score, until all the remaining translation data has been processed;
if the comparison score is not greater than the heap-top score, returning to the step of selecting a translation score from the remaining translation data as the comparison score and continuing;
and if the comparison score is greater than the heap-top score, replacing the heap-top score with the comparison score (after which the heap is re-adjusted so that the minimum of the retained scores returns to the top).
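A minimal sketch of this min-heap Top-k selection using Python's standard heapq module; the function name and the (score, text) pairing are illustrative assumptions.

```python
import heapq
import itertools

def top_k_translations(scored_texts, k):
    """Keep the k highest-scoring (score, text) pairs with a size-k min-heap:
    seed the heap with k arbitrary pairs, then for each remaining pair replace
    the heap top whenever the new score beats the current minimum."""
    items = iter(scored_texts)
    heap = list(itertools.islice(items, k))
    heapq.heapify(heap)                    # heap[0] holds the smallest score
    for item in items:
        if item[0] > heap[0][0]:           # comparison score > heap-top score
            heapq.heapreplace(heap, item)  # pop the minimum, push the new pair
    return sorted(heap, reverse=True)

# e.g. top_k_translations([(0.91, "a"), (0.85, "b"), (0.97, "c"), (0.62, "d")], 2)
# returns [(0.97, "c"), (0.91, "a")]
```

heapq.heapreplace pops the minimum and pushes the new item in one sift operation, which is exactly the "replace the heap top, then re-adjust" step of the claim.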
4. An artificial intelligence based semantic recognition device, wherein the device, when operated, implements the artificial intelligence based semantic recognition method of any one of claims 1 to 3, the device comprising:
the data receiving module is used for receiving voice response data sent by the client;
the text translation module is used for performing voice-to-text conversion processing on the voice response data by adopting a voice recognition algorithm to obtain at least two translation text data;
the data scoring module is used for scoring the translation text data through a preset translation evaluation model, and taking the k translation text data with the highest translation scores as target data, wherein k is a positive integer;
the associated recall module is used for recalling n candidate results associated with the target data from a database through a set recall strategy, wherein n is a positive integer, each target data corresponds to a plurality of candidate results in the database, and the candidate results are standard sentences stored in the database that share the same context as the target data;
the semantic matching module is used for feeding each target data and its recalled candidate results into a twin network model, and analyzing the text semantic matching degree between the translation text and each candidate result through the twin network model to obtain a similarity score;
and the result determining module is used for determining a comprehensive score by combining the translation score and the similarity score, ranking all the predicted results from high to low by comprehensive score, returning the predicted result with the highest score as the target result, and acquiring the translation text data corresponding to the target result as the recognition result.
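How the translation score and the similarity score are blended into the comprehensive score is not specified by the claims; the sketch below assumes a simple linear combination with an illustrative weight `alpha`.

```python
def pick_target_result(pairs, alpha=0.5):
    """pairs: (translation_score, similarity_score, translation_text) triples.
    Blend the two scores into a comprehensive score (the equal-weight linear
    blend is an assumption), rank high to low, and return the top prediction's
    translation text as the recognition result."""
    ranked = sorted(pairs,
                    key=lambda p: alpha * p[0] + (1 - alpha) * p[1],
                    reverse=True)
    return ranked[0][2]
```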
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the artificial intelligence based semantic recognition method according to any one of claims 1 to 3 when executing the computer program.
6. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the artificial intelligence based semantic recognition method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111537450.0A CN114218356B (en) | 2021-12-15 | 2021-12-15 | Semantic recognition method, device, equipment and storage medium based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114218356A CN114218356A (en) | 2022-03-22 |
CN114218356B true CN114218356B (en) | 2024-07-26 |
Family
ID=80702536
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111537450.0A Active CN114218356B (en) | 2021-12-15 | 2021-12-15 | Semantic recognition method, device, equipment and storage medium based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114218356B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116610964B (en) * | 2023-07-20 | 2023-09-26 | 之江实验室 | Text similarity matching method and device and computer equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112885338A (en) * | 2021-01-29 | 2021-06-01 | 深圳前海微众银行股份有限公司 | Speech recognition method, apparatus, computer-readable storage medium, and program product |
CN113223516A (en) * | 2021-04-12 | 2021-08-06 | 北京百度网讯科技有限公司 | Speech recognition method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113421551B (en) * | 2020-11-16 | 2023-12-19 | 腾讯科技(深圳)有限公司 | Speech recognition method, speech recognition device, computer readable medium and electronic equipment |
CN113436612B (en) * | 2021-06-23 | 2024-02-27 | 平安科技(深圳)有限公司 | Intention recognition method, device, equipment and storage medium based on voice data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |