CN115270818A - Intention identification method and device, storage medium and computer equipment - Google Patents

Intention identification method and device, storage medium and computer equipment

Info

Publication number
CN115270818A
Authority
CN
China
Prior art keywords
sentence
similarity value
intention
text information
standard
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210931847.6A
Other languages
Chinese (zh)
Inventor
徐华韫
黄明星
王福钋
曹富康
张航飞
董婉
沈鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Absolute Health Ltd
Original Assignee
Beijing Absolute Health Ltd
Application filed by Beijing Absolute Health Ltd
Priority to CN202210931847.6A
Publication of CN115270818A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intention identification method and device, a storage medium and computer equipment, relates to the technical field of artificial intelligence, and mainly aims to solve the problem of low accuracy of intention identification caused by category limitation in the existing intention classification method. The method comprises the following steps: acquiring text information of the intention to be identified; if no pre-matching item is detected in the text information, performing semantic similarity processing on the text information and standard sentences containing target intentions and similar sentences expanded by the standard sentences respectively to obtain a first similarity value and a second similarity value; determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value.

Description

Intention identification method and device, storage medium and computer equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an intention identification method and device, a storage medium and computer equipment.
Background
With the continuous development of artificial intelligence technology, natural language recognition and understanding technology is increasingly applied in production and daily life, and intention recognition plays an important role in a growing number of fields, such as conversational products like intelligent customer service.
At present, the traditional intention identification method mainly preprocesses the user's utterance (script) and feeds it directly into an intention classification model, which classifies the intention according to the similarity threshold of each intention category to obtain the intention classification result. However, the conventional method requires classifying in advance the intentions that the user may express and setting a similarity threshold for each intention. In a relatively open field, classifying possible user intentions in advance is inherently limited: the predefined categories cannot completely cover the user's intentions, so those intentions are forced into the limited predefined categories and the accuracy of intention recognition is reduced.
Disclosure of Invention
In view of the above, the present invention provides an intention identifying method and apparatus, a storage medium, and a computer device, and mainly aims to solve the problem of low accuracy of intention identification due to class limitation in the existing intention classifying method.
According to an aspect of the present invention, there is provided an intention identifying method including:
acquiring text information of the intention to be identified;
if no pre-matching item is detected in the text information, performing semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules;
determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value.
Further, after the text information of the intention to be recognized is obtained, the method further includes:
performing word segmentation processing on the text information to obtain word segments in the text information;
detecting the word segments based on matching objects in the pre-matching item; the matching objects comprise keywords and/or words and characters to which the matching rule applies; the matching rule is a rule for semantic matching;
if a matching object is detected in the word segments, determining the intention of the text information based on the word segments.
Further, before the semantic similarity processing is performed on the text information and the standard sentences containing the target intentions and the similar sentences expanded by the standard sentences respectively to obtain the first similarity value and the second similarity value, the method further includes:
obtaining a standard sentence with a target intention, and performing semantic similarity expansion on one or more of grammar, word order and words of the standard sentence to obtain a plurality of similar sentences; the similar sentence characterizes the same target intent as the standard sentence;
establishing an intention recognition library of a target field based on the standard sentences and the similar sentences;
and converting the standard sentences and the similar sentences into first sentence vectors and second sentence vectors respectively by adopting a language model.
Further, before the converting the standard sentence and the similar sentence into the first sentence vector and the second sentence vector respectively by using the language model, the method further includes:
and training the language model by adopting text corpora to obtain the language model for sentence vector conversion.
Further, the obtaining a first similarity value and a second similarity value by performing semantic similarity processing on the text information and standard sentences containing the target intention and similar sentences expanded by the standard sentences respectively includes:
converting the text information into a third sentence vector by adopting the language model;
calculating the semantic similarity between the first sentence vector and the third sentence vector by adopting cosine similarity to obtain a first similarity value;
and calculating the semantic similarity between the second sentence vector and the third sentence vector by adopting the cosine similarity to obtain a second similarity value.
Further, the method further comprises:
randomly selecting two sentences from the intention recognition library to carry out semantic similarity processing to obtain a third similarity value;
converting the two sentences into a fourth sentence vector and a fifth sentence vector respectively by adopting the language model, and calculating the absolute value of the difference between the fourth sentence vector and the fifth sentence vector to obtain a selected difference vector;
performing semantic similarity processing on the spliced vector spliced by the fourth sentence vector, the fifth sentence vector and the selected difference vector to obtain a fourth similarity value;
and taking the difference value between the third similarity value and the fourth similarity value as a training error of the language model, and adjusting parameters in the language model by adopting an optimizer based on the training error so as to finish the updating and adjusting of the language model.
Further, the determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value comprises:
comparing the first similarity value with the second similarity value to obtain a maximum similarity value;
if the maximum similarity value is the first similarity value, determining a first target standard sentence corresponding to the first similarity value as the intention of the text information;
and if the maximum similarity value is the second similarity value, determining a target similar sentence according to the second similarity value, determining an expanded second target standard sentence based on the target similar sentence, and determining the second target standard sentence as the intention of the text information.
According to another aspect of the present invention, there is provided an intention recognition apparatus including:
the information acquisition module is used for acquiring text information of the intentions to be identified;
the similarity processing module is used for respectively carrying out semantic similarity processing on the text information and standard sentences containing target intentions and similar sentences expanded by the standard sentences to obtain a first similarity value and a second similarity value if a pre-matching item is not detected in the text information, wherein the pre-matching item contains matching objects with different matching rules;
an intention determining module, configured to determine an intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value.
According to still another aspect of the present invention, there is provided a storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the above-described intention recognition method.
According to still another aspect of the present invention, there is provided a computer device, including a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the intention identification method in any item.
The technical solutions provided by the embodiments of the present invention have at least the following advantages:
Compared with the prior art, in which an intention classification model classifies the intention into preset limited categories, the intention identification method and device, storage medium and computer equipment provided by the invention acquire the text information of the intention to be identified; if no pre-matching item is detected in the text information, perform semantic similarity processing on the text information and, respectively, a standard sentence containing a target intention and the similar sentences expanded from each standard sentence to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules; and determine the intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value. A target intention knowledge base is thus constructed and intentions are identified against it, which avoids forcing intentions into preset limited categories and improves the accuracy of intention identification. When the sentences in the knowledge base are processed, they are converted into sentence vectors, so that the semantic information of the sentences can be fully processed and analyzed, meeting the requirements of intention identification in open fields.
The foregoing description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly, and that the above and other objects, features, and advantages may be more readily apparent, specific embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart of an intent recognition method provided by an embodiment of the present invention;
FIG. 2 is a flow chart illustrating an intent recognition method provided by an embodiment of the present invention;
FIG. 3 is a flow chart illustrating an intent recognition method provided by an embodiment of the present invention;
FIG. 4 is a flow chart of an intent recognition method provided by embodiments of the present invention;
FIG. 5 is a flow chart of an intent recognition method provided by embodiments of the present invention;
FIG. 6 is a flow chart illustrating an intent recognition method provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating an intention identifying apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the present invention provides an intention identifying method, as shown in fig. 1, the method including:
101. acquiring text information of the intention to be identified;
In the embodiment of the present invention, the current execution end obtains the text information whose intention is to be recognized. Ways of obtaining the text information include, but are not limited to, text directly input by a user, text transcribed from the user's voice input, digital text obtained by scanning a paper document, and the like.
102. If no pre-matching item is detected in the text information, performing semantic similarity processing on the text information and standard sentences containing target intentions and similar sentences expanded by the standard sentences respectively to obtain a first similarity value and a second similarity value;
In the embodiment of the invention, the current execution end detects whether the text information contains a pre-matching item, where the pre-matching item contains matching objects with different matching rules. Possible matching objects include, but are not limited to, keywords and/or words and characters to which the matching rules apply; a matching rule is a rule for semantic matching. A keyword may be a malicious word: when the text information is detected to include a malicious word, the intention of the user is recognized as malicious based on that word, and an intention recognition result is obtained. A rule may be a composition of words and characters, such as "go ... office" or "find ... office", where the ellipsis between the two words stands for omitted characters. When the text information is detected to contain "go" and "office", or "find" and "office", with a plurality of characters between the two words, the current execution end recognizes the intention of the user as a complaint based on the detected rule and obtains an intention recognition result. The specific keyword and rule settings are not specifically limited in the embodiment of the present invention.
It should be noted that, when the current execution end does not detect a pre-matching item, it performs semantic similarity processing on the text information and the standard sentences to obtain a first similarity value, and at the same time performs semantic similarity processing on the text information and the similar sentences expanded from the standard sentences to obtain a second similarity value. A standard sentence is a sentence representing a target intention, such as "not understood" or "good"; the embodiment of the present invention is not particularly limited. The similar sentences are sentences expanded from the standard sentences and represent the same target intention as the standard sentence from which they were expanded. For example, when the standard sentence is "not understood", it can be expanded to obtain "I do not understand", "what did you say", and the like. Similarity processing is the process of calculating a similarity value between two pieces of text information.
103. Determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value.
In the embodiment of the invention, the calculated similarity values represent how similar the text information to be identified is to the standard sentences and to the similar sentences. After obtaining the first similarity value and the second similarity value, the current execution end therefore compares them to obtain the maximum similarity value. The sentence corresponding to the maximum similarity value best represents the real intention of the text information, so the current execution end determines the intention of the text information from that sentence.
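The comparison in step 103 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name, the dictionary layout, and the tie-breaking rule (a tie goes to the standard sentence) are assumptions that do not come from the source.

```python
from typing import Dict, Tuple


def determine_intent(
    first_scores: Dict[str, float],
    second_scores: Dict[str, float],
    similar_to_standard: Dict[str, str],
) -> Tuple[str, float]:
    """Return (standard sentence, similarity) for the best-matching library sentence.

    first_scores:  similarity of the input text to each standard sentence.
    second_scores: similarity of the input text to each similar sentence.
    similar_to_standard: maps a similar sentence back to the standard
        sentence it was expanded from (hypothetical data layout).
    """
    best_standard, first_value = max(first_scores.items(), key=lambda kv: kv[1])
    best_similar, second_value = max(second_scores.items(), key=lambda kv: kv[1])
    # Assumption: on a tie the standard sentence wins; the source does not say.
    if first_value >= second_value:
        return best_standard, first_value
    # A similar sentence won, so report the standard sentence it expands.
    return similar_to_standard[best_similar], second_value


first = {"not understood": 0.62, "how long it takes": 0.41}
second = {"I do not understand": 0.91, "how long do I still need": 0.38}
mapping = {
    "I do not understand": "not understood",
    "how long do I still need": "how long it takes",
}
print(determine_intent(first, second, mapping))
```

Because a winning similar sentence is mapped back to its standard sentence, the caller always receives a standard sentence as the recognized intention.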
Further, as a refinement and extension of the above embodiment, in order to improve the efficiency of intention recognition by screening out and recognizing in advance the intentions with obvious features, another intention recognition method is provided. After the step of obtaining the text information of the intention to be recognized, the method further includes:
201. performing word segmentation processing on the text information to obtain word segments in the text information;
In the embodiment of the invention, the current execution end performs word segmentation processing on the text information, dividing the sentence into a plurality of words to obtain the word segments in the text information. Word segmentation methods include, but are not limited to, the reverse maximum matching algorithm, the method of setting segmentation flags, the bidirectional maximum matching method, and the like; the embodiment of the present invention is not particularly limited.
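As an illustration of one of the segmentation methods named above, the following is a minimal sketch of reverse maximum matching. The vocabulary, the `max_len` parameter, and the unspaced English example string are assumptions chosen for readability; a real system would use a full dictionary for the target language.

```python
from typing import List, Set


def reverse_maximum_matching(text: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Segment `text` by scanning from the end and taking the longest dictionary word each time."""
    segments: List[str] = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted so the scan keeps moving.
            if size == 1 or candidate in vocabulary:
                segments.append(candidate)
                end -= size
                break
    # Segments were collected right to left, so restore reading order.
    segments.reverse()
    return segments


toy_vocabulary = {"go", "to", "office"}  # hypothetical dictionary
print(reverse_maximum_matching("gotooffice", toy_vocabulary, max_len=6))
```

Characters that match no dictionary entry fall out as single-character segments, which is the usual fallback for this family of algorithms.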
202. Detecting the word segments based on the matching objects in the pre-matching item;
In the embodiment of the present invention, a pre-matching item is set before the word segments are checked. The pre-matching item includes multiple matching objects, each of which may be a keyword or a matching rule composed of words and characters; the embodiment of the present invention is not particularly limited. When the word segments are checked against the pre-matching item, every matching object in the pre-matching item is looked for in the word segments until all word segments have been examined.
203. If a matching object is detected in the word segments, determining the intention of the text information based on the word segments.
In the embodiment of the present invention, if the current execution end detects a matching object in the word segments, such as a keyword or a matching rule composed of words and characters, the intention of the text information is determined based on the detected matching object. If a hostile keyword is detected in the word segments, the current execution end recognizes the intention of the text information as hostile based on that keyword and obtains an intention recognition result. If a matching rule composed of words and characters is detected in the word segments, such as "go ... office" or "find ... office", where the ellipsis between the two words stands for omitted characters, the current execution end recognizes the intention of the text information as a complaint based on the detected rule and obtains an intention recognition result. The embodiment of the present invention is not specifically limited in this respect.
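Steps 202 and 203 can be sketched as a lookup over the word segments. The keyword table, the rule table, and the choice to check keywords before rules are all hypothetical; the sketch also assumes a rule matches whenever its second word appears anywhere after its first, with any number of segments in between.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical pre-matching item: keywords and ordered-word rules, each mapped to an intent.
KEYWORD_INTENTS: Dict[str, str] = {"idiot": "hostile"}
RULE_INTENTS: List[Tuple[Tuple[str, str], str]] = [
    (("go", "office"), "complaint"),
    (("find", "office"), "complaint"),
]


def match_pre_items(segments: Sequence[str]) -> Optional[str]:
    """Return the intent of the first matching object found in the segments, else None."""
    segs = list(segments)
    # Keyword objects: a single segment is enough to decide the intent.
    for segment in segs:
        if segment in KEYWORD_INTENTS:
            return KEYWORD_INTENTS[segment]
    # Rule objects such as "go ... office": the second word must follow the first.
    for (first, second), intent in RULE_INTENTS:
        if first in segs and second in segs[segs.index(first) + 1:]:
            return intent
    # No pre-matching item detected: the caller falls back to similarity processing.
    return None


print(match_pre_items(["I", "will", "go", "to", "the", "office"]))
```

A `None` result corresponds to the case in step 102 where no pre-matching item is detected and semantic similarity processing takes over.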
Further, as a refinement and extension of the above embodiment, in order to avoid the limitation of target intention classification, meet the fine-granularity requirements of intention recognition, and improve the accuracy of intention recognition, another intention recognition method is provided. Before the text information is subjected to semantic similarity processing with a standard sentence containing a target intention and with the similar sentences expanded from each standard sentence to obtain a first similarity value and a second similarity value, the method further includes:
301. obtaining a standard sentence with a target intention, and performing semantic similarity expansion on one or more of grammar, word order and words of the standard sentence to obtain a plurality of similar sentences;
In the embodiment of the present invention, before the current execution end establishes the intention recognition library of the target field, standard sentences expressing target intentions in the target field are set, such as "not understood", "good", "how long it takes", "can apply", and the like; the embodiment of the present invention is not specifically limited. When the standard sentence "how long it takes" is expanded, semantically similar variations can be made to its grammar, its word order, or its words, yielding similar sentences such as "how long do I still need", "how long I need", "how long it needs", and the like. These similar sentences characterize the same target intention as the standard sentence "how long it takes".
It should be noted that the standard sentences of the target intention and the semantically expanded similar sentences may include answer sentences from questionnaires, conversation sentences from communication channels such as telephone and WeChat, sentences typed by users into human-computer interaction devices, sentences input by voice, and the like; the embodiment of the present invention is not particularly limited. Accordingly, the applicable scenarios of the embodiment of the present invention include, but are not limited to, telephone sales, user return visits, intelligent robots, terminal devices with a human-computer interaction function, and the like.
302. Establishing an intention recognition library of a target field based on the standard sentences and the similar sentences;
In the embodiment of the present invention, the current execution end establishes the intention recognition library of the target field based on the standard sentences and the similar sentences. The sentences in the intention recognition library may be stored in forms including, but not limited to, a compressed archive, a TXT file, a database, and the like. The intention recognition library can be stored locally at the current execution end, on a background server, or on a cloud server.
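One possible in-memory layout for such a library is a mapping from every sentence to the standard sentence that carries its intention. The sketch below is an assumption about the data structure, not the storage format used by the source, and serializes the mapping to JSON only as an example of a text-file form.

```python
import json
from typing import Dict, List


def build_intent_library(expansions: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every sentence (standard or similar) to the standard sentence that carries its intent."""
    library: Dict[str, str] = {}
    for standard, similar_sentences in expansions.items():
        # A standard sentence represents its own intent.
        library[standard] = standard
        for sentence in similar_sentences:
            # A similar sentence shares the intent of the sentence it expands.
            library[sentence] = standard
    return library


# Example sentences are taken from the description; the grouping is illustrative.
expansions = {
    "not understood": ["I do not understand", "what did you say"],
    "how long it takes": ["how long do I still need", "how long I need"],
}
library = build_intent_library(expansions)
# Serialized form that could be written to a local file or a server-side store.
serialized = json.dumps(library, ensure_ascii=False, indent=2)
print(serialized)
```

Keeping the similar-to-standard link in the library is what later allows a best-matching similar sentence to be resolved to its standard sentence.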
303. And converting the standard sentences and the similar sentences into first sentence vectors and second sentence vectors respectively by adopting a language model.
In the embodiment of the invention, before the current execution end performs similarity processing, the standard sentences and the similar sentences in the intention recognition library are preprocessed, that is, converted into sentence vectors by a language model. In the embodiment of the invention, the language model is a bert-wwm model, which performs vector conversion on the standard sentences and the similar sentences to obtain first sentence vectors and second sentence vectors. A sentence vector is the vector representation of the text information after its full-text semantic information has been fused.
Further, as a refinement and an extension of the specific implementation of the above embodiment, in order to make the sentence vectors obtained by the language model more accurate, another method for identifying intent is provided, before the step of converting the standard sentences and the similar sentences into the first sentence vectors and the second sentence vectors respectively by using the language model, the method further includes:
and training the language model by adopting text corpora to obtain the language model for sentence vector conversion.
In the embodiment of the invention, the current execution end trains the bert-wwm model on a large amount of text corpus information, so that the performance of the bert-wwm model is more stable and the resulting sentence vectors are more accurate. The language model for sentence vector conversion is obtained after training.
Further, as a refinement and an extension of the specific implementation of the above embodiment, in order to fully explain the specific implementation process of the similarity processing, obtaining the similarity between the text information of the intention to be recognized and each sentence in the intention recognition library, and more accurately expressing the true intention of the text information of the intention to be recognized, another intention recognition method is provided, in which the steps of performing semantic similarity processing on the text information and a standard sentence containing the target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value include:
401. converting the text information into a third sentence vector by adopting the language model;
in the embodiment of the present invention, the current execution end uses the language model to convert the text information into the third sentence vector, and the language model used may be a bert-wwm model used for sentence vector conversion after training.
402. Calculating the semantic similarity between the first sentence vector and the third sentence vector by adopting cosine similarity to obtain a first similarity value;
403. and calculating the semantic similarity between the second sentence vector and the third sentence vector by adopting the cosine similarity to obtain a second similarity value.
In the embodiment of the invention, the current execution end uses cosine similarity to calculate the semantic similarity between the first sentence vector and the third sentence vector to obtain the first similarity value, and likewise between the second sentence vector and the third sentence vector to obtain the second similarity value. Cosine similarity is the cosine of the angle between two vectors and measures how much the two vectors differ: the closer the cosine is to 1, the closer the angle between the vectors is to 0 degrees, that is, the more similar the two vectors are. Because the calculation is based on sentence vectors that fuse full-text semantic information, the resulting similarity value is closer to the true value.
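The cosine similarity used in steps 402 and 403 can be written in a few lines. The sketch below uses plain Python lists and toy three-dimensional vectors as stand-ins for the sentence vectors a language model would produce; the vector values are invented for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy stand-ins for sentence vectors (assumed values, not model output).
third_vector = [1.0, 2.0, 0.0]    # text whose intention is to be identified
first_vector = [2.0, 4.0, 0.0]    # a standard sentence
second_vector = [0.0, 0.0, 3.0]   # a similar sentence

first_similarity = cosine_similarity(first_vector, third_vector)
second_similarity = cosine_similarity(second_vector, third_vector)
print(first_similarity, second_similarity)
```

Here the first vector points in the same direction as the input vector and the second is orthogonal to it, so the two results sit at the similar and dissimilar ends of the scale.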
Further, as a refinement and an extension of the specific implementation of the above embodiment, in order to perform error correction on the language model and obtain a more accurate sentence vector, another intention identification method is provided, where the method further includes:
501. randomly selecting two sentences from the intention recognition library to carry out semantic similarity processing to obtain a third similarity value;
In the embodiment of the invention, the current execution end arbitrarily selects two sentences from the intention recognition library and performs semantic similarity processing on them to obtain a third similarity value. The semantic similarity processing uses a known similarity processing model, such as a roformer-sim model; the embodiment of the present invention is not particularly limited. The third similarity value calculated for the two sentences is taken as their standard similarity value. In the embodiment of the invention, the two sentences can be labeled with this standard similarity value as a soft label, and the soft label is used directly as the target output during training. For example, if the similarity value calculated for two arbitrarily selected sentences is 0.9, the two sentences are labeled with a soft label of 0.9; if it is 0.85, they are labeled with a soft label of 0.85.
502. Converting the two sentences into a fourth sentence vector and a fifth sentence vector respectively by adopting the language model, and calculating the absolute value of the difference between the fourth sentence vector and the fifth sentence vector to obtain a selected difference vector;
In the embodiment of the present invention, the current execution end uses the language model to convert the two arbitrarily selected sentences into a fourth sentence vector and a fifth sentence vector, and then calculates the absolute value of the difference between the two to obtain the selected difference vector. For example, if the fourth sentence vector is denoted as vector u and the fifth sentence vector as vector v, the absolute value of their difference is |u-v|, that is, the selected difference vector is |u-v|; the embodiment of the present invention is not specifically limited in this respect.
503. Performing semantic similarity processing on the spliced vector spliced by the fourth sentence vector, the fifth sentence vector and the selected difference vector to obtain a fourth similarity value;
In the embodiment of the present invention, the current execution end splices the fourth sentence vector, the fifth sentence vector, and the selected difference vector into a spliced vector, and then processes the spliced vector with a known similarity processing model, such as a roformer-sim model, to obtain the fourth similarity value.
504. Taking the difference value between the third similarity value and the fourth similarity value as a training error of the language model, and adjusting parameters in the language model by adopting an optimizer based on the training error so as to complete the updating and adjustment of the language model.
In the embodiment of the invention, the current execution end calculates the difference between the third similarity value and the fourth similarity value and uses it as the training error of the language model. In the updating stage of the language model, the output of the model may be compared with the soft label, the average value of the cross-entropy loss function of the model is calculated based on the training error, and the optimizer is then used to adjust the parameters of the language model, including the query, key, and value (QKV) matrices of the multi-head attention mechanism in BERT, the parameter matrices of the feed-forward neural network, and the like. After multiple adjustments, once the training error of the language model falls within the preset error threshold range, the updating and adjustment of the language model is complete.
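Steps 501 to 504 can be sketched as follows, under stated assumptions. The difference |u-v| is taken element-wise, the scoring head `predicted_similarity` (a linear layer followed by a sigmoid) is a hypothetical stand-in for the known similarity processing model, and all numeric values are illustrative. The optimizer step that adjusts the language model parameters is omitted.

```python
import math
from typing import List, Sequence


def selected_difference_vector(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Element-wise absolute difference |u - v| (element-wise is an assumption)."""
    return [abs(x - y) for x, y in zip(u, v)]


def spliced_vector(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Splice u, v and |u - v| into one feature vector."""
    return list(u) + list(v) + selected_difference_vector(u, v)


def predicted_similarity(features: Sequence[float],
                         weights: Sequence[float],
                         bias: float) -> float:
    """Hypothetical scoring head: linear layer followed by a sigmoid."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fourth and fifth sentence vectors.
u = [0.5, -1.0]
v = [0.25, 1.0]
features = spliced_vector(u, v)

# Untrained head with all-zero parameters, purely for illustration.
weights = [0.0] * len(features)
bias = 0.0

third_similarity = 0.9  # soft label from the known similarity model (step 501)
fourth_similarity = predicted_similarity(features, weights, bias)  # step 503
training_error = third_similarity - fourth_similarity  # step 504
```

In an actual training loop, the training error would feed a loss function, and an optimizer would update the language model parameters until the error falls within the preset threshold.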
Further, as a refinement and an extension of the specific implementation of the above embodiment, in order to make the result of the intention recognition more accurate and to give the output of the intention recognition a uniform expression, another intention recognition method is provided, wherein the determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value includes:
601. Comparing the first similarity value with the second similarity value to obtain a maximum similarity value;
In the embodiment of the invention, the current execution end compares the first similarity value with the second similarity value and selects the maximum value from the set consisting of the first similarity value and the second similarity value to obtain the maximum similarity value. The sentence corresponding to the maximum similarity value represents the true intention of the text information.
602. If the maximum similarity value is the first similarity value, determining a first target standard sentence corresponding to the first similarity value as the intention of the text information;
603. If the maximum similarity value is the second similarity value, determining a target similar sentence according to the second similarity value, determining an expanded second target standard sentence based on the target similar sentence, and determining the second target standard sentence as the intention of the text information.
In the embodiment of the invention, the current execution end judges the obtained maximum similarity value. If the maximum similarity value is the first similarity value, the first target standard sentence corresponding to the first similarity value is determined as the intention of the text information; that is, the intention of the text information is determined from standard sentences such as "not understand", "good", "how long it takes", and "can apply". If the maximum similarity value is the second similarity value, the target similar sentence is determined from the similar sentences, for example from similar sentences such as "I do not understand" and "what you said"; the standard sentence "not understand", from which the target similar sentence was expanded, is determined as the second target standard sentence, and "not understand" is determined as the intention of the text information. The embodiment of the present invention is not limited in this respect.
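Steps 601 to 603 can be sketched as follows. The function name `determine_intention`, the mapping `expanded_from` (similar sentence to the standard sentence it was expanded from), the similarity values, and the tie-breaking rule are all assumptions made for illustration; the sentences are taken from the example above.

```python
from typing import Dict


def determine_intention(first_similarities: Dict[str, float],
                        second_similarities: Dict[str, float],
                        expanded_from: Dict[str, str]) -> str:
    """Return the standard sentence that expresses the intention of the text.

    first_similarities maps each standard sentence to its first similarity value;
    second_similarities maps each similar sentence to its second similarity value.
    """
    best_standard, first_value = max(first_similarities.items(), key=lambda kv: kv[1])
    best_similar, second_value = max(second_similarities.items(), key=lambda kv: kv[1])
    # Assumption: a tie is resolved in favour of the standard sentence.
    if first_value >= second_value:
        return best_standard  # step 602
    # Step 603: map the target similar sentence back to its standard sentence.
    return expanded_from[best_similar]


# Hypothetical similarity values for one input text.
first_similarities = {"not understand": 0.40, "good": 0.55}
second_similarities = {"I do not understand": 0.93, "what you said": 0.41}
expanded_from = {
    "I do not understand": "not understand",
    "what you said": "not understand",
}

intention = determine_intention(first_similarities, second_similarities, expanded_from)
```

Because the result is always a standard sentence, the output of the intention recognition keeps a uniform expression regardless of which branch is taken.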
The invention provides an intention identification method that acquires text information of the intention to be identified; if no pre-matching item is detected in the text information, performs semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules; and determines the intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value. Because a target intention knowledge base is constructed and the intention is identified against that knowledge base, the intention need not be classified into a preset, limited set of categories, and the accuracy of intention identification is improved. When the sentences in the knowledge base are processed, their semantic information is converted into sentence vectors, so that the semantic information of the sentences can be processed and analyzed completely, meeting the requirement for intention identification in the open field.
As an implementation of the method shown in fig. 1, an embodiment of the present invention provides an intention identifying apparatus, as shown in fig. 7, including:
the information acquisition module 71 is configured to acquire text information of the intention to be recognized;
a similarity processing module 72, configured to, if a pre-matching item is not detected in the text information, perform semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence, respectively, to obtain a first similarity value and a second similarity value, where the pre-matching item includes matching objects of different matching rules;
an intention determining module 73, configured to determine an intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value.
Further, the apparatus further comprises:
the word segmentation processing module is used for performing word segmentation processing on the text information to obtain word segments in the text information;
the pre-detection module is used for detecting the word segments based on matching objects in the pre-matching items; the matching object comprises keywords and/or words and characters applicable to the matching rule; the matching rule is a rule for semantic matching;
a pre-determination module for determining an intention of the text information based on the word segments if the matching object is detected in the word segments.
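The cooperation of these three modules can be illustrated with a minimal sketch. The keyword table `PRE_MATCHING_ITEMS`, the function names, and the whitespace-based segmentation are hypothetical; an actual implementation would use a proper word segmenter and the matching rules of the target field.

```python
from typing import Dict, List, Optional

# Hypothetical pre-matching items: matching object (keyword) -> intention.
PRE_MATCHING_ITEMS: Dict[str, str] = {
    "refund": "request refund",
    "price": "ask price",
}


def segment(text: str) -> List[str]:
    """Assumption: whitespace splitting stands in for a real word segmenter."""
    return text.lower().split()


def pre_match(text: str, items: Dict[str, str] = PRE_MATCHING_ITEMS) -> Optional[str]:
    """Return the intention of the first word segment that is a matching object.

    Returns None when no matching object is detected, in which case the
    semantic similarity processing would be used instead.
    """
    for word in segment(text):
        if word in items:
            return items[word]
    return None
```

For example, `pre_match("What is the price")` returns `"ask price"`, while a text with no matching object returns `None` and falls through to the similarity processing module.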
Further, the apparatus further comprises:
the sentence acquisition module is used for acquiring a standard sentence with a target intention, and performing semantic similar expansion on one or more of grammar, word order and words of the standard sentence to obtain a plurality of similar sentences; the similar sentence characterizes the same target intent as the standard sentence;
the recognition library establishing module is used for establishing an intention recognition library of the target field based on the standard sentences and the similar sentences;
and the sentence conversion module is used for converting the standard sentence and the similar sentence into a first sentence vector and a second sentence vector respectively by adopting a language model.
Further, the apparatus further comprises:
and the model training module is used for training the language model by adopting text corpora to obtain the language model for sentence vector conversion.
Further, the similarity processing module 72 further includes:
the sentence conversion unit is used for converting the text information into a third sentence vector by adopting the language model;
the similarity calculation unit is used for calculating the semantic similarity between the first sentence vector and the third sentence vector by adopting cosine similarity to obtain a first similarity value; and calculating the semantic similarity between the second sentence vector and the third sentence vector by adopting the cosine similarity to obtain a second similarity value.
Further, the apparatus further comprises:
the model updating module is used for randomly selecting two sentences from the intention recognition library to carry out semantic similarity processing to obtain a third similarity value;
converting the two sentences into a fourth sentence vector and a fifth sentence vector respectively by adopting the language model, and calculating the absolute value of the difference between the fourth sentence vector and the fifth sentence vector to obtain a selected difference vector;
performing semantic similarity processing on the spliced vector spliced by the fourth sentence vector, the fifth sentence vector and the selected difference vector to obtain a fourth similarity value;
and taking the difference value between the third similarity value and the fourth similarity value as a training error of the language model, and adjusting parameters in the language model by adopting an optimizer based on the training error so as to finish the updating and adjusting of the language model.
Further, the intention determining module 73 further includes:
the comparison unit is used for comparing the first similarity value with the second similarity value to obtain a maximum similarity value;
a target intention determining unit, configured to determine, if the maximum similarity value is the first similarity value, a first target standard sentence corresponding to the first similarity value as an intention of the text information;
and if the maximum similarity value is the second similarity value, determining a target similar sentence according to the second similarity value, determining an expanded second target standard sentence based on the target similar sentence, and determining the second target standard sentence as the intention of the text information.
The invention provides an intention identification device that acquires text information of the intention to be identified; if no pre-matching item is detected in the text information, performs semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules; and determines the intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value. Because a target intention knowledge base is constructed and the intention is identified against that knowledge base, the intention need not be classified into a preset, limited set of categories, and the accuracy of intention identification is improved. When the sentences in the knowledge base are processed, their semantic information is converted into sentence vectors, so that the semantic information of the sentences can be processed and analyzed completely, meeting the requirement for intention identification in the open field.
According to an embodiment of the present invention, there is provided a storage medium storing at least one executable instruction, the executable instruction being capable of executing the intention identification method in any of the above-described method embodiments.
Fig. 8 is a schematic structural diagram of another computer device according to an embodiment of the present invention, where the specific embodiment of the present invention does not limit the specific implementation of the computer device.
As shown in fig. 8, the computer device may include: a processor 802, a communication interface 804, a memory 806, and a communication bus 808.
Wherein: the processor 802, communication interface 804, and memory 806 communicate with one another via a communication bus 808.
A communication interface 804 for communicating with network elements of other devices, such as clients or other servers.
The processor 802 is configured to execute the program 810, and may specifically perform the relevant steps in the above intention identification method embodiments.
In particular, the program 810 may include program code comprising computer operating instructions.
The processor 802 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the invention. The computer device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 806 stores a program 810. The memory 806 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 810 may be specifically configured to cause the processor 802 to perform the following operations:
acquiring text information of the intention to be identified;
if no pre-matching item is detected in the text information, performing semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules;
determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device, and may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by a computing device. In some cases, the steps shown or described may be performed in an order different from that described herein; the modules or steps may also be fabricated separately as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An intent recognition method, comprising:
acquiring text information of the intention to be identified;
if no pre-matching item is detected in the text information, performing semantic similarity processing on the text information and a standard sentence containing a target intention and a similar sentence expanded by each standard sentence respectively to obtain a first similarity value and a second similarity value, wherein the pre-matching item contains matching objects with different matching rules;
determining the intention of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value.
2. The method according to claim 1, wherein after the obtaining of the text information of the intent to be recognized, the method further comprises:
performing word segmentation processing on the text information to obtain word segments in the text information;
detecting the word segments based on matching objects in the pre-matching items; the matching object comprises keywords and/or words and characters applicable to the matching rule; the matching rule is a rule for semantic matching;
if the matching object is detected in the word segments, determining the intention of the text information based on the word segments.
3. The method according to claim 1, wherein before the semantic similarity processing is performed on the text information and the standard sentences containing the target intentions and the similar sentences expanded by the standard sentences respectively to obtain the first similarity value and the second similarity value, the method further comprises:
acquiring a standard sentence with a target intention, and performing semantic similar expansion on one or more items in grammar, word order and words of the standard sentence to obtain a plurality of similar sentences; the similar sentence characterizes the same target intent as the standard sentence;
establishing an intention recognition library of a target field based on the standard sentences and the similar sentences;
and converting the standard sentences and the similar sentences into first sentence vectors and second sentence vectors respectively by adopting a language model.
4. The method of claim 3, wherein prior to using the language model to convert the standard sentence and the similar sentence into a first sentence vector and a second sentence vector, respectively, the method further comprises:
and training the language model by adopting text corpora to obtain the language model for sentence vector conversion.
5. The method of claim 3, wherein the obtaining the first similarity value and the second similarity value by performing semantic similarity processing on the text information and a standard sentence containing the target intention and a similar sentence expanded by each standard sentence respectively comprises:
converting the text information into a third sentence vector by adopting the language model;
calculating the semantic similarity between the first sentence vector and the third sentence vector by adopting cosine similarity to obtain a first similarity value;
and calculating the semantic similarity between the second sentence vector and the third sentence vector by adopting the cosine similarity to obtain a second similarity value.
6. The method of claim 3, further comprising:
randomly selecting two sentences from the intention recognition library to carry out semantic similarity processing to obtain a third similarity value;
converting the two sentences into a fourth sentence vector and a fifth sentence vector respectively by adopting the language model, and calculating the absolute value of the difference between the fourth sentence vector and the fifth sentence vector to obtain a selected difference vector;
performing semantic similarity processing on the spliced vector spliced by the fourth sentence vector, the fifth sentence vector and the selected difference vector to obtain a fourth similarity value;
and taking the difference value between the third similarity value and the fourth similarity value as a training error of the language model, and adjusting parameters in the language model by adopting an optimizer based on the training error so as to finish the updating and adjusting of the language model.
7. The method of claim 1, wherein the determining the intent of the text information from the standard sentence and the similar sentence based on the comparison result between the first similarity value and the second similarity value comprises:
comparing the first similarity value with the second similarity value to obtain a maximum similarity value;
if the maximum similarity value is the first similarity value, determining a first target standard sentence corresponding to the first similarity value as the intention of the text information;
and if the maximum similarity value is the second similarity value, determining a target similar sentence according to the second similarity value, determining an expanded second target standard sentence based on the target similar sentence, and determining the second target standard sentence as the intention of the text information.
8. An intention recognition apparatus, comprising:
the information acquisition module is used for acquiring text information of the intentions to be identified;
the similarity processing module is used for respectively carrying out semantic similarity processing on the text information and standard sentences containing target intentions and similar sentences expanded by the standard sentences to obtain a first similarity value and a second similarity value if a pre-matching item is not detected in the text information, wherein the pre-matching item contains matching objects with different matching rules;
an intention determining module, configured to determine an intention of the text information from the standard sentence and the similar sentence based on a comparison result between the first similarity value and the second similarity value.
9. A storage medium having stored therein at least one executable instruction that performs an operation corresponding to the intent recognition method of any of claims 1-7.
10. A computer device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface are communicated with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the intention identification method of any one of claims 1-7.
CN202210931847.6A 2022-08-04 2022-08-04 Intention identification method and device, storage medium and computer equipment Pending CN115270818A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210931847.6A CN115270818A (en) 2022-08-04 2022-08-04 Intention identification method and device, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210931847.6A CN115270818A (en) 2022-08-04 2022-08-04 Intention identification method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN115270818A true CN115270818A (en) 2022-11-01

Family

ID=83749601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210931847.6A Pending CN115270818A (en) 2022-08-04 2022-08-04 Intention identification method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN115270818A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116340523A (en) * 2023-05-30 2023-06-27 北京中关村科金技术有限公司 Session intention recognition method and device, computer equipment, storage medium and software

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116340523A (en) * 2023-05-30 2023-06-27 北京中关村科金技术有限公司 Session intention recognition method and device, computer equipment, storage medium and software
CN116340523B (en) * 2023-05-30 2023-09-26 北京中关村科金技术有限公司 Session intention recognition method and device, computer equipment, storage medium and software


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100102 201 / F, block C, 2 lizezhong 2nd Road, Chaoyang District, Beijing

Applicant after: Beijing Shuidi Technology Group Co.,Ltd.

Address before: 100102 201, 2 / F, block C, No.2 lizezhong 2nd Road, Chaoyang District, Beijing

Applicant before: Beijing Health Home Technology Co.,Ltd.
