CA3174601C - Text intent identifying method, device, computer equipment and storage medium - Google Patents

Text intent identifying method, device, computer equipment and storage medium

Info

Publication number
CA3174601C
Authority
CA
Canada
Prior art keywords
text
intent
preset
similarity
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CA3174601A
Other languages
French (fr)
Other versions
CA3174601A1 (en)
Inventor
Liangliang XIN
Heqiang NI
Yun Bai
Yingbo Pan
Qiang Sun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3174601A1
Application granted
Publication of CA3174601C
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A text intent recognition method and apparatus, a computer device and a storage medium. The method comprises: acquiring text to be processed (S102); inputting the text into a text classification model to obtain a similar corpus of the text outputted by the text classification model and a first similarity between the similar corpus and the text, wherein the text classification model is trained according to corpora with annotated intents (S104); determining a first candidate intent of the text according to the similar corpus (S106); extracting entity information of the text, and acquiring a second candidate intent of the text according to the entity information (S108); acquiring a second similarity between the entity information and the text (S110); and screening the final intent of the text from the first candidate intent and the second candidate intent according to the first similarity and the second similarity (S112). The described method is able to improve the accuracy of text content intent recognition.

Description

TEXT INTENT IDENTIFYING METHOD, DEVICE, COMPUTER EQUIPMENT AND STORAGE MEDIUM
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present application relates to the field of text processing technology, and more particularly to a text intent identifying method, and corresponding device, computer equipment, and storage medium.
Description of Related Art
[0002] Generally speaking, the intent behind a piece of text can be determined from its content. Intent identification of text contents is usually treated as a classification task in which sentences are classified into corresponding intent types. NLU (Natural Language Understanding) is mainly responsible for extracting the contents to be understood from text. In the field of NLU, the traditional mode is to employ an algorithm to extract intents of text contents. Specifically, marked corpora of a uniform format are input into an algorithm, and intents of text contents are determined by comparing the degrees of confidence or classification results output from the algorithm. However, marked data is often insufficient during actual development. When an algorithm trained on insufficient marked corpora is used to determine intents of text contents, the finally identified intents will be relatively low in precision.
SUMMARY OF THE INVENTION

Date Regue/Date Received 2022-09-06
[0003] In view of the above technical problems, there is an urgent need to provide a text intent identifying method, and corresponding device, computer equipment and storage medium capable of enhancing precision of text content intent identification.
[0004] There is provided a text intent identifying method that comprises:
obtaining a text to be processed;
inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
determining a first candidate intent of the text to be processed according to the similar corpus;
extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information;
obtaining a second similarity between the entity information and the text to be processed; and
screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
[0005] In one of the embodiments, the step of extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information is entered when the first similarity is greater than a first preset value and smaller than a second preset value, wherein the first preset value is smaller than the second preset value; the text intent identifying method further comprises:
taking the first candidate intent to serve as the final intent when the first similarity is greater than or equal to the second preset value; and/or generating reminder information when the first similarity is smaller than or equal to the first preset value.
[0006] In one of the embodiments, the step of extracting entity information of the text to be processed includes: obtaining plural preset term types, each of which is associated with a first preset intent; obtaining a word searching algorithm to which the various preset term types correspond, wherein the word searching algorithm is employed for searching terms to which the various preset term types correspond; extracting terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtaining plural first target terms of the text to be processed; and generating the entity information according to the plural first target terms.
[0007] In one of the embodiments, the step of obtaining a second candidate intent of the text to be processed according to the entity information includes: obtaining a preset intent set, wherein the preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms; obtaining plural first target terms in the entity information; and screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, and determining the second candidate intent according to the target intent.
[0008] In one of the embodiments, the step of screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set includes: obtaining a preset keyword; performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
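The keyword-first branching described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: it assumes that "term-matching" means exact membership of a term in an intent's preset terms, and the function name `screen_target_intents` and its data layout are hypothetical.

```python
from typing import Dict, List, Set


def screen_target_intents(
    first_target_terms: List[str],
    preset_intents: Dict[str, Set[str]],
    preset_keyword: str,
) -> List[str]:
    """Return the second preset intents whose preset terms match the text.

    preset_intents maps each second preset intent to its associated preset
    terms. Matching is exact set membership (an assumption of this sketch).
    """
    if preset_keyword in first_target_terms:
        # The preset keyword was extracted: match on the keyword alone.
        terms_to_match = {preset_keyword}
    else:
        # No preset keyword: match every first target term.
        terms_to_match = set(first_target_terms)
    return [
        intent
        for intent, preset_terms in preset_intents.items()
        if terms_to_match & preset_terms
    ]
```

Giving the preset keyword priority keeps an incidental term from pulling in an unrelated intent when a strong signal is already present in the text.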

[0009] In one of the embodiments, the step of obtaining a second similarity between the entity information and the text to be processed includes: obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed; taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent; the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity includes: taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
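A minimal sketch of this selection rule is given below. It assumes the first sub-similarities have already been computed and are passed in as a mapping from target intent to score. The name `pick_final_intent` is hypothetical, and the fallback for an empty mapping is an assumption of the sketch, not something the text specifies.

```python
from typing import Dict, Tuple


def pick_final_intent(
    first_candidate_intent: str,
    first_similarity: float,
    sub_similarities: Dict[str, float],
) -> Tuple[str, float]:
    """Choose the final intent and report the similarity that selected it.

    sub_similarities maps each target intent to its first sub-similarity.
    """
    if not sub_similarities:
        # Assumption: with no target intent, keep the model's candidate.
        return first_candidate_intent, first_similarity
    # The second similarity is the highest first sub-similarity. With one
    # target intent this is simply that intent's sub-similarity.
    best_target_intent = max(sub_similarities, key=sub_similarities.get)
    second_similarity = sub_similarities[best_target_intent]
    if first_similarity >= second_similarity:
        return first_candidate_intent, first_similarity
    return best_target_intent, second_similarity
```

A tie goes to the first candidate intent, matching the "greater than or equal to" condition in the text.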
[0010] In one of the embodiments, the step of obtaining a second similarity between the entity information and the text to be processed includes: performing term-segmentation on the text to be processed, and obtaining plural second target terms of the text to be processed; obtaining a first number of the first target terms and a second number of the second target terms; and obtaining a ratio of the first number to the second number, and determining the second similarity according to the ratio.
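The ratio can be sketched as below, assuming the second similarity is taken to be the ratio itself (the text only says it is determined "according to" the ratio). The function name is hypothetical, and the whitespace split in the usage note stands in for a real term-segmentation step.

```python
from typing import List


def ratio_second_similarity(
    first_target_terms: List[str], second_target_terms: List[str]
) -> float:
    """Ratio of extracted entity terms to all segmented terms of the text."""
    if not second_target_terms:
        # Assumption: an empty segmentation yields zero similarity.
        return 0.0
    return len(first_target_terms) / len(second_target_terms)
```

For example, if segmenting "my phone order is late" yields five terms and two of them ("phone", "order") are first target terms, `ratio_second_similarity(["phone", "order"], "my phone order is late".split())` gives a ratio of 2 to 5.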
[0011] There is provided a text intent identifying device that comprises: a first obtaining module, for obtaining a text to be processed; a second obtaining module, for inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents; a first determining module, for determining a first candidate intent of the text to be processed according to the similar corpus; a third obtaining module, for extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information; a fourth obtaining module, for obtaining a second similarity between the entity information and the text to be processed; and a second determining module, for screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
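One way to picture how these modules fit together is the sketch below, in which each module is an injected callable. The class name `TextIntentDevice`, its field names and the callable signatures are all hypothetical. The text describes what each module does, not a programming interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TextIntentDevice:
    # Second obtaining module: text -> (similar corpus, first similarity).
    classify: Callable[[str], Tuple[str, float]]
    # First determining module: similar corpus -> first candidate intent.
    intent_of_corpus: Callable[[str], str]
    # Third obtaining module: text -> (entity information, second candidate intent).
    extract_entities: Callable[[str], Tuple[List[str], str]]
    # Fourth obtaining module: (entity information, text) -> second similarity.
    entity_similarity: Callable[[List[str], str], float]

    def identify(self, text: str) -> str:
        """Run the modules in order and screen the final intent."""
        similar_corpus, first_similarity = self.classify(text)
        first_candidate = self.intent_of_corpus(similar_corpus)
        entity_info, second_candidate = self.extract_entities(text)
        second_similarity = self.entity_similarity(entity_info, text)
        # Second determining module: the higher similarity wins, and a tie
        # goes to the first candidate intent.
        if first_similarity >= second_similarity:
            return first_candidate
        return second_candidate
```

Injecting the modules as callables lets the classification model and the entity extractor be replaced or stubbed independently, for instance in tests.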
[0012] There is provided a computer equipment that comprises a memory, a processor, and a computer program stored on the memory and operable on the processor, and steps of the method according to any one of the above embodiments are realized when the processor executes the computer program.
[0013] There is provided a computer-readable storage medium that stores a computer program thereon, and steps of the method according to any one of the above embodiments are realized when the computer program is executed by a processor.
[0014] In the aforementioned text intent identifying method, and corresponding device, computer equipment and storage medium, a text to be processed is firstly input in a text classification model to obtain a first similarity between a similar corpus and the text to be processed. At the same time, a first intent of the text to be processed is determined according to the similar corpus. Entity information of the text to be processed is then extracted, and a second intent of the text to be processed is obtained according to the entity information. At the same time, a second similarity between the entity information and the text to be processed is obtained. Finally, an intent of the text to be processed is determined according to the first similarity and the second similarity, and the intent of the text to be processed is the first intent or the second intent. Accordingly, the first intent and the second intent of the text to be processed are respectively determined through the text classification model and the entity information of the text to be processed, and the final intent is chosen between the first intent and the second intent according to the first similarity and the second similarity. Several modes are thus employed to identify the intent of the text to be processed. This avoids relying on a single text classification model, whose precision in intent identification is relatively low when marked corpora are insufficient, and precision in text content intent identification is enhanced.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] Fig. 1 is a view illustrating an application environment for a text intent identifying method in an embodiment;
[0016] Fig. 2 is a flowchart schematically illustrating a text intent identifying method in an embodiment;
[0017] Fig. 3 is a flowchart schematically illustrating a text intent identifying method in another embodiment;
[0018] Fig. 4 is a flowchart schematically illustrating step S108 in an embodiment;
[0019] Fig. 5 is a flowchart schematically illustrating step S108 in another embodiment;
[0020] Fig. 6 is a flowchart schematically illustrating step S1085 in an embodiment;
[0021] Fig. 7 is a flowchart schematically illustrating step S110 in an embodiment;
[0022] Fig. 8 is a block diagram illustrating the structure of a text intent identifying device in an embodiment; and
[0023] Fig. 9 is a view illustrating the internal structure of the computer equipment in an embodiment.
DETAILED DESCRIPTION OF THE INVENTION
[0024] To make the objectives, technical solutions and advantages of the present application clearer, the present application is described in greater detail below with reference to accompanying drawings and embodiments. As should be understood, the specific embodiments as described here are merely meant to explain the present application, rather than to restrict the present application.
[0025] The text intent identifying method provided by the present application is applied to the application environment as shown in Fig. 1. The user can carry out data interaction with the corresponding service platform through various applications on the terminal.
Particularly, the user can send a text of a Q&A type to the corresponding service platform through an application on the terminal to receive reply information dispatched by the service platform.
The client server is the server that supports the service platform. The service platform receives, through the client server, the text of the Q&A type sent by the user, namely a text to be processed. Further, the text to be processed is input in a text classification model to obtain a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed. At the same time, a first candidate intent of the text to be processed is determined according to the similar corpus. In addition, the service platform extracts entity information of the text to be processed, obtains a second candidate intent of the text to be processed according to the entity information, and obtains a second similarity between the entity information and the text to be processed.
Finally, a final intent of the text to be processed is screened out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity. The final intent is precisely the intent to which the text of the Q&A type sent by the user corresponds. Hence, the service platform reads a corresponding reply answer according to the obtained intent and dispatches the reply answer to the terminal of the user. The terminal here can be a hardware device such as a computer, a tablet computer, a smart mobile phone, and the like. The client server can be embodied as a single server or a server cluster consisting of a plurality of servers.
[0026] In one embodiment, as shown in Fig. 2, there is provided a text intent identifying method, and the method is described with an example being applied to the service platform (specifically, a client server that supports the service platform) in Fig. 1, and the method comprises the following steps.
[0027] S102 - obtaining a text to be processed.
[0028] In this embodiment, the user sends text information of a Q&A type to the service platform through the terminal. On reception of the text information of a Q&A type sent from the user, the service platform takes the text information as a text to be processed. The text to be processed is employed to characterize a user intent, and the user intent can be obtained by performing intent identification on the text to be processed. For instance, the text to be processed can be such a text expressing the consultative intent of the user as "a return request has been submitted", "the mobile phone I bought does not work", or "where is my goods".
[0029] S104 - inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents.
[0030] In this embodiment, having obtained the text to be processed, the service platform inputs the text to be processed in the text classification model. The text classification model has already been trained by means of corpora with marked intents. The text classification model is employed for identifying the text to be processed according to the corpora with marked intents, and outputting a candidate similar corpus similar to the text to be processed and a similarity between the candidate similar corpus and the text to be processed.
The candidate similar corpus can be one or more. Correspondingly, the similarity between the candidate similar corpus/corpora and the text to be processed can also be one or more.
When there are plural candidate similar corpora, the candidate similar corpus with the highest similarity is selected to serve as the similar corpus of the text to be processed, and the highest similarity is the first similarity between the similar corpus and the text to be processed. The text classification model can be a Text-CNN model (text convolution model). When the text classification model is being trained, stopwords are first removed from the Q&A corpora with marked sentence dimensions (corpora with marked intents), and the model is then trained on the result. For instance, useless words such as the modal particles ma, la and ni (all of which are Chinese mood particles; translator's note) are removed. Moreover, before the text to be processed is input in the trained text classification model, any stopword is removed from it, and the text with stopwords removed is then input in the trained text classification model to obtain a similar corpus of the text to be processed and a first similarity between the similar corpus and the text to be processed.
Accordingly, the processing efficiency of the service platform can be enhanced.
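The preprocessing and selection around the model call can be sketched as follows. The Text-CNN itself is out of scope here: the sketch assumes the text has already been segmented into terms and that the model returns a list of (candidate similar corpus, similarity) pairs. The function names are hypothetical, and the stopword set holds only the three particles named above, for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Illustrative stopword list: only the mood particles mentioned in the text.
STOPWORDS: Set[str] = {"ma", "la", "ni"}


def remove_stopwords(
    terms: Iterable[str], stopwords: Set[str] = STOPWORDS
) -> List[str]:
    """Drop stopwords from an already segmented text."""
    return [term for term in terms if term not in stopwords]


def select_similar_corpus(
    candidates: List[Tuple[str, float]]
) -> Tuple[str, float]:
    """Pick the candidate with the highest similarity.

    The returned pair is (similar corpus, first similarity).
    """
    return max(candidates, key=lambda pair: pair[1])
```

The same stopword removal is applied to the training corpora and to the incoming text, so the model sees both in the same form.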
[0031] S106 - determining a first candidate intent of the text to be processed according to the similar corpus.
[0032] In this embodiment, when the service platform determines the similar corpus of the text to be processed according to the text classification model, a user intent to which the similar corpus corresponds is obtained, and the user intent is taken to serve as the first candidate intent of the text to be processed. Specifically, the service platform stores plural pieces of corpora with marked intents. Because the text classification model has been trained by means of these corpora, the similar corpus output when the text to be processed is input in the text classification model has already been marked with an intent, and the first candidate intent of the text to be processed can be determined according to the marked intent.
Alternatively, having obtained the similar corpus, the service platform obtains a corresponding standard corpus according to the similar corpus, and hence determines the first candidate intent of the text to be processed according to the standard corpus.
The standard corpus has been marked with intent. The first candidate intent of the text to be processed can be determined according to a standard question.
[0033] For example, the service platform stores Q&A corpora with marked sentence dimensions (with marked intents), such as corpora of the after-sale type. As regards the corpora of the after-sale type, their standard questions (Q&A corpora with marked sentence dimensions) and similar questions are as shown below:
[0034] {
[0035] "intent": "consulting return and exchange time (urging for return)",
[0036] "text": "a return request has been submitted"
[0037] },
[0038] {
[0039] "intent": "consulting return and exchange time (urging for return)",
[0040] "text": "I have submitted a return request"
[0041] },
[0042] {
[0043] "intent": "applying for return",
[0044] "text": "Return goods"
[0045] };
[0046] wherein the intent field corresponds to a standard question, and the text field corresponds to a similar question. There is a one-to-many corresponding relation between a standard question and similar questions. After having obtained a similar question with the highest similarity, the user obtains the final result by searching for the answer to the corresponding standard question.
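The one-to-many relation between a standard question and its similar questions can be sketched with two lookup tables built from the records above. The record contents are taken from the example. The function name `build_indexes` and the in-memory dictionaries are assumptions of the sketch, not a storage format described in the text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CORPORA: List[Dict[str, str]] = [
    {
        "intent": "consulting return and exchange time (urging for return)",
        "text": "a return request has been submitted",
    },
    {
        "intent": "consulting return and exchange time (urging for return)",
        "text": "I have submitted a return request",
    },
    {"intent": "applying for return", "text": "Return goods"},
]


def build_indexes(
    corpora: List[Dict[str, str]]
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Index similar questions by standard question and the reverse."""
    intent_by_text: Dict[str, str] = {}
    texts_by_intent: Dict[str, List[str]] = defaultdict(list)
    for record in corpora:
        intent_by_text[record["text"]] = record["intent"]
        texts_by_intent[record["intent"]].append(record["text"])
    return intent_by_text, dict(texts_by_intent)
```

Once the similar question with the highest similarity is known, `intent_by_text` gives its standard question, which is the key under which the answer is looked up.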
[0047] S108 - extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information.
[0048] In this embodiment, the service platform extracts the entity information of the text to be processed. The entity information can be information made up of segmented terms in the text to be processed. For instance, the entity information contains category words, brand words, hot words and keywords, etc. The entity information can further be determined according to the text content of the text to be processed. For instance, the semantics of the text to be processed is determined based on the text content of the text to be processed, and the semantics of the text to be processed is taken to serve as the entity information.
[0049] Moreover, the service platform obtains the second candidate intent of the text to be processed according to the entity information. Specifically, the service platform contains therein plural preset intents, and the various preset intents correspond to associated information. The second candidate intent of the text to be processed can be determined according to a matching relation between the entity information and the associated information of the various preset intents.
[0050] S110 - obtaining a second similarity between the entity information and the text to be processed.
[0051] In this embodiment, the second similarity can be a similarity between the semantics of the entity information and the text to be processed. When the entity information consists of one or more segmented term(s) extracted from the text to be processed, the second similarity can be further determined according to the proportion of the one or more segmented term(s) to the text to be processed. The second similarity characterizes the similar degree between the entity information and the text to be processed.

[0052] S112 - screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
[0053] In this embodiment, the service platform determines the first similarity and the first candidate intent of the text to be processed according to the text classification model, determines the second similarity and the second candidate intent of the text to be processed according to the entity information of the text to be processed, and hence screens a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity. When the first similarity is greater than or equal to the second similarity, the final intent is the first candidate intent. When the first similarity is smaller than the second similarity, the final intent is the second candidate intent. Accordingly, by comparing the similarities obtained through the two modes, the candidate intent to which the highest similarity corresponds is taken to serve as the final intent of the text to be processed, so that the finally determined intent of the text to be processed is more precise, and the low precision in intent identification caused by determining the intent of the text to be processed through a single mode is avoided.
[0054] In the aforementioned text intent identifying method, a text to be processed is firstly input in a text classification model to obtain a similar corpus and a first similarity between the similar corpus and the text to be processed. At the same time, a first intent of the text to be processed is determined according to the similar corpus. Entity information of the text to be processed is then extracted, and a second intent of the text to be processed is obtained according to the entity information. At the same time, a second similarity between the entity information and the text to be processed is obtained. Finally, an intent of the text to be processed is determined according to the first similarity and the second similarity, and the intent of the text to be processed is the first intent or the second intent.
Accordingly, the first intent and the second intent of the text to be processed are respectively determined through the text classification model and the entity information of the text to be processed, and the final intent is chosen between the first intent and the second intent according to the first similarity and the second similarity. Several modes are thus employed to identify the intent of the text to be processed. This avoids relying on a single text classification model, whose precision in intent identification is relatively low when marked corpora are insufficient, and precision in text content intent identification is enhanced.
[0055] In one embodiment, as shown in Fig. 3, before step S108 is entered, the service platform sets a precondition. The precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value. When the first similarity is greater than the first preset value and smaller than the second preset value, step S108 is entered. When the precondition is not satisfied, there are two circumstances. Circumstance 1, with reference to step S1074: when the first similarity is greater than or equal to the second preset value, the first candidate intent is taken to serve as the final intent. Circumstance 2, with reference to step S1072: when the first similarity is smaller than or equal to the first preset value, reminder information is generated.
[0056] Specifically, after the text to be processed, with any stopword removed, has been subjected to classification identification by means of the trained text classification model, a candidate similar corpus and a similarity between the candidate similar corpus and the text to be processed output from the model are obtained. There are plural pieces of candidate similar corpora, there are also plural similarities between the candidate similar corpora and the text to be processed, and the candidate similar corpora are sorted according to size of the similarities. Moreover, the service platform obtains the candidate similar corpus with the highest similarity. If the similarity to which the candidate similar corpus with the highest similarity corresponds is greater than or equal to the second preset value (95%, for example), the intent to which the candidate similar corpus corresponds is directly taken to serve as the first candidate intent and hence as the final intent, and the process terminates at this time, without having to execute step S108. If the similarity to which the candidate similar corpus with the highest similarity corresponds is greater than the first preset value (60%, for example) and smaller than the second preset value, step S108 is executed. If the similarity to which the candidate similar corpus with the highest similarity corresponds is smaller than or equal to the first preset value, reminder information is generated, and it is also not needed to execute step S108 at this time.
Accordingly, the capability of the service platform for intent identification of the text to be processed can be enhanced.
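The three-way precondition described above can be sketched as follows. This is a minimal illustration only: the names `Route` and `route_by_first_similarity` are hypothetical, and the default thresholds are simply the example values (60% and 95%) mentioned in the text.

```python
from enum import Enum


class Route(Enum):
    ACCEPT_FIRST_CANDIDATE = "S1074"  # first candidate intent becomes the final intent
    RUN_ENTITY_BRANCH = "S108"        # continue with entity extraction
    REMIND_USER = "S1072"             # generate reminder information


def route_by_first_similarity(first_similarity, first_preset=0.60, second_preset=0.95):
    """Decide which step follows, based on the first similarity."""
    if not first_preset < second_preset:
        raise ValueError("first preset value must be smaller than the second")
    if first_similarity >= second_preset:
        return Route.ACCEPT_FIRST_CANDIDATE
    if first_similarity <= first_preset:
        return Route.REMIND_USER
    # Strictly between the two preset values.
    return Route.RUN_ENTITY_BRANCH
```

Both boundary values fall outside the band that triggers step S108, which matches the "greater than or equal to" and "smaller than or equal to" wording of the two circumstances.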
[0057] In one embodiment, as shown in Fig. 4, step S108 includes:
[0058] S1082 - obtaining plural preset term types, each of which is associated with a first preset intent;
[0059] S1084 - obtaining a word searching algorithm to which the various preset term types correspond, wherein the word searching algorithm is employed for searching terms to which the various preset term types correspond;
[0060] S1086 - extracting terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtaining plural first target terms of the text to be processed; and
[0061] S1088 - generating the entity information according to the plural first target terms.
[0062] In this embodiment, the service platform presets therein plural preset term types, each of which is associated with a first preset intent. For instance, the plural term types include category words, hot words, brand words, and keywords. A category word corresponds to one or more first preset intent(s), a hot word corresponds to one or more first preset intent(s), a brand word corresponds to one or more first preset intent(s), and a keyword corresponds to one or more first preset intent(s). In addition, the word searching algorithm to which the various preset term types correspond is employed for searching terms to which the various preset term types correspond. The service platform extracts terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtains plural first target terms of the text to be processed. The word searching algorithm to which the various preset term types correspond can be one and the same word searching algorithm. The word searching algorithm can be a dictionary tree searching algorithm. Finally, the entity information is generated according to the plural first target terms. The entity information can include the plural first target terms, and can also be any other information generated according to the plural first target terms and containing no first target term. Accordingly, the capability of the service platform for extracting entity information of the text to be processed can be enhanced. For instance, in the specific process of generating the entity information, term segmentation is performed on the text to be processed, and the result of term segmentation is subjected to NER (named entity recognition) pickup by means of word-dimension corpora to obtain the entity information in the text to be processed. The entity information can include categories, brands, hot words, and keywords, etc.
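A dictionary tree search of the kind mentioned above might look like the sketch below. It is an assumption-laden illustration rather than the platform's actual algorithm: the class names, the character-level longest-match strategy, and the sample dictionary entries are all hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.term_type = None  # set only on the node that ends a dictionary term


class TermTrie:
    """Dictionary tree holding preset terms, each tagged with its term type."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, term, term_type):
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.term_type = term_type

    def extract(self, text):
        """Return (term, term_type) pairs, taking the longest match at each position."""
        found = []
        i = 0
        while i < len(text):
            node = self.root
            last_end, last_type = None, None
            j = i
            while j < len(text) and text[j] in node.children:
                node = node.children[text[j]]
                j += 1
                if node.term_type is not None:
                    last_end, last_type = j, node.term_type
            if last_end is not None:
                found.append((text[i:last_end], last_type))
                i = last_end  # skip past the matched term
            else:
                i += 1
        return found


# Hypothetical dictionary entries for illustration.
trie = TermTrie()
trie.add("mobile phone", "category word")
trie.add("bought", "keyword")
trie.add("does not work", "keyword")

first_target_terms = trie.extract("the mobile phone I bought does not work")
```

A single trie serves all term types here, in line with the remark that the various preset term types can share one word searching algorithm.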
[0063] In one embodiment, as shown in Fig. 5, step S108 further includes:
[0064] S1081 - obtaining a preset intent set, wherein the preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms;
[0065] S1083 - obtaining plural first target terms in the entity information; and
[0066] S1085 - screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, and determining the second candidate intent according to the target intent.
[0067] In this embodiment, a preset intent set was previously set in the service platform. The preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms. For instance, when a second preset intent is a purchasing intent, preset terms associated therewith can include "buy", "purchase" and "sell", etc. When a second preset intent is an after-sale intent, preset terms associated therewith can include "sell" and "having been damaged", etc. Through the association relations between the preset terms and the second preset intents, it is possible to screen a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents. There can be one or more target intent(s).
The service platform can determine the second candidate intent according to the target intent. Therefore, by screening a target intent out of the preset intent set according to plural first target terms in the entity information, and hence determining the second candidate intent according to the target intent, it is possible for the service platform to quickly obtain the second candidate intent.
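Screening by association relations can be illustrated with a small lookup. The sketch assumes exact string matching between first target terms and preset terms; the function name is hypothetical, and the intent set reuses the example terms given above.

```python
# Second preset intents and their associated preset terms (examples from the text).
PRESET_INTENT_SET = {
    "purchasing": {"buy", "purchase", "sell"},
    "after-sale": {"sell", "having been damaged"},
}


def screen_target_intents(first_target_terms, preset_intent_set=PRESET_INTENT_SET):
    """Return every preset intent that shares at least one term with the input."""
    terms = set(first_target_terms)
    return [
        intent
        for intent, preset_terms in preset_intent_set.items()
        if terms & preset_terms
    ]
```

Because "sell" is associated with both intents in the example, a text containing only that term yields two target intents, which is why the later steps must handle plural target intents.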
[0068] In one embodiment, as shown in Fig. 6, step S1085 includes:
[0069] S10852 - obtaining a preset keyword;
[0070] S10854 - performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the first target sub-candidate intent serves as the target intent; and
[0071] S10856 - performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
[0072] In this embodiment, preset keywords are set in the service platform. The preset keyword can be set according to the intent of the current activity or according to the user intent identifiable by the system. The user intent can be directly identified according to the preset keyword. Moreover, plural first target terms are extracted from the text to be processed, and matching identification is performed on the preset keyword and the plural first target terms to judge whether the plural first target terms contain the preset keyword. If yes, the preset keyword is term-matched with the preset terms associated with the various second preset intents, a first target sub-candidate intent is screened out of the plural second preset intents of the preset intent set according to the term matching result, and the first target sub-candidate intent serves as the target intent. Accordingly, it is unnecessary to perform term-matching on all the first target terms with the preset terms associated with the various second preset intents, thus dispensing with some computational work of the service platform, and enhancing the efficiency of the service platform for intent identification of the text to be processed. If the plural first target terms do not contain the preset keyword, the plural first target terms are respectively term-matched with the preset terms associated with the second preset intents, a second target sub-candidate intent is screened out of the plural second preset intents of the preset intent set according to the term matching result, and the second target sub-candidate intent serves as the target intent. When the first target terms are term-matched with the preset terms associated with the second preset intents to screen out the second target sub-candidate intent, the first target terms can correspond to one or more second target sub-candidate intent(s).
[0073] A specific implementation scenario is given below with respect to obtaining the target intent according to the preset keyword.
[0074] Keyword filtering is employed to preliminarily screen the intents of the text to be processed. Specifically, a subset of the intent list supported by the customer service system is obtained; suppose that the current system supports four intents: after-sale, shopping guide, activities query, and special-offer coupons query. By filtering the text to be processed, "the mobile phone I bought does not work", through keyword-type NER (named entity recognition), the two keywords "buy" and "does not work" are obtained, and the two corresponding intents of shopping guide and after-sale are hence obtained. It is thus possible to compare only the two intents of shopping guide and after-sale in the subsequent cosine similarity calculation, whereby some additional computational work is dispensed with.
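The keyword-first screening of steps S10852 to S10856 can be sketched as below. The function name, the preset keyword set, and the term-to-intent associations are hypothetical values chosen to mirror the four-intent scenario; exact string matching is assumed.

```python
def screen_with_keyword_priority(first_target_terms, preset_keywords, preset_intent_set):
    """Match only the preset keywords when any are present; otherwise match all terms."""
    keyword_hits = [t for t in first_target_terms if t in preset_keywords]
    terms_to_match = keyword_hits if keyword_hits else list(first_target_terms)
    return [
        intent
        for intent, preset_terms in preset_intent_set.items()
        if any(term in preset_terms for term in terms_to_match)
    ]


# Hypothetical associations for the four supported intents.
intent_set = {
    "after-sale": {"does not work"},
    "shopping guide": {"buy", "mobile phone"},
    "activities query": {"activity"},
    "special-offer coupons query": {"coupon"},
}
keywords = {"buy", "does not work", "activity"}

narrowed = screen_with_keyword_priority(
    ["buy", "mobile phone", "does not work"], keywords, intent_set
)
```

With the sample values, `narrowed` holds only after-sale and shopping guide, so any later similarity calculation needs to consider two intents instead of four.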
[0075] In one embodiment, it is further possible to eliminate preset eliminated words from the plural first target terms when the plural first target terms contain preset eliminated words to obtain plural object terms. The object terms are term-matched with the preset terms associated with the various second preset intents, a first target sub-candidate intent is screened out of the plural second preset intents of the preset intent set according to the term matching result, and the first target sub-candidate intent serves as the target intent.
[0076] In this embodiment, plural preset eliminated words can be preset in the service platform for screening terms with respect to the plural first target terms. When the plural first target terms contain the preset eliminated words, the preset eliminated words in the plural first target terms are eliminated, the first target terms that remain are term-matched with the preset terms associated with the various second preset intents, and a first target sub-candidate intent is finally screened out of the plural second preset intents of the preset intent set according to the term matching result.
[0077] In one embodiment, as shown in Fig. 7, step S110 includes:
[0078] S1102 - obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;
[0079] S1104 - taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and
[0080] S1106 - taking the first sub-similarity to serve as the second similarity, when there is one target intent.
[0081] At this time, step S112 includes:
[0082] S1122 - taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
[0083] S1124 - taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and
[0084] S1126 - taking a target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
[0085] In this embodiment, when there are plural target intents determined according to plural first target terms, there are also plural first sub-similarities to which the plural target intents correspond. At this time, the first sub-similarity with the highest similarity in the plural first sub-similarities is taken to serve as the second similarity. The target intent to which the second similarity corresponds is taken to serve as the second candidate intent at this time. When there is one target intent determined according to plural first target terms, no screening is needed, the first sub-similarity to which this target intent corresponds is directly taken to serve as the second similarity, and this target intent is precisely the second candidate intent. Accordingly, when the final intent of the text to be processed is screened in step S112, if the first similarity is greater than or equal to the second similarity, the first candidate intent is directly taken to serve as the final intent of the text to be processed. If the first similarity is smaller than the second similarity and the second candidate intents include plural target intents, the target intent to which the second similarity corresponds is taken to serve as the final intent of the text to be processed. When the first similarity is smaller than the second similarity and the second candidate intents include one target intent, the target intent in the second candidate intents is taken to serve as the final intent of the text to be processed. Accordingly, the service platform can supply approaches for intent identification of the text to be processed under many circumstances, and the capability for intent identification of the text to be processed is enhanced.
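The selection logic of steps S1102 to S1126 reduces to taking a maximum and then making one comparison. The sketch below assumes the first sub-similarities are already available as a mapping from target intent to a number in [0, 1]; the function names are hypothetical.

```python
def pick_second_candidate(sub_similarities):
    """Return (target intent, second similarity) from a dict of first sub-similarities."""
    if not sub_similarities:
        raise ValueError("at least one target intent is required")
    # With a single target intent, max() simply returns that intent.
    intent = max(sub_similarities, key=sub_similarities.get)
    return intent, sub_similarities[intent]


def pick_final_intent(first_candidate, first_similarity, sub_similarities):
    """Choose between the first candidate intent and the best target intent."""
    second_candidate, second_similarity = pick_second_candidate(sub_similarities)
    if first_similarity >= second_similarity:
        return first_candidate
    return second_candidate
```

Handling the single-intent and plural-intent cases through the same maximum keeps steps S1124 and S1126 in one code path; a tie between the two similarities favours the first candidate intent, as in step S1122.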
[0086] In one embodiment, step S110 includes: performing term-segmentation on the text to be processed, and obtaining plural second target terms of the text to be processed; obtaining a first number of the first target terms and a second number of the second target terms; and obtaining a ratio of the first number to the second number, and determining the second similarity according to the ratio.
[0087] In this embodiment, when the second similarity between the entity information and the text to be processed is obtained, term-segmentation is performed on the text to be processed to obtain plural second target terms. Moreover, a second number of the second target terms of the text to be processed is obtained, a first number of the first target terms in the entity information is obtained, and a ratio of the first number to the second number is obtained. This ratio is taken to serve as the second similarity. For instance, if the text to be processed reads "the mobile phone I bought does not work" and the entity information concerns "bought" and "not work", then the similarity between the two is (2/5) × 100% = 40%.
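The ratio-based second similarity can be written in a few lines. In this sketch the five-term segmentation of the example sentence is a hypothetical one, chosen only so that the count matches the denominator used in the text; the function name is likewise an assumption.

```python
def ratio_similarity(first_target_terms, second_target_terms):
    """Second similarity as (number of entity terms) / (number of segmented terms)."""
    if not second_target_terms:
        raise ValueError("the text to be processed produced no terms")
    return len(first_target_terms) / len(second_target_terms)


# Hypothetical segmentation of "the mobile phone I bought does not work".
second_target_terms = ["the", "mobile phone", "I", "bought", "not work"]
first_target_terms = ["bought", "not work"]

second_similarity = ratio_similarity(first_target_terms, second_target_terms)  # 2 / 5
```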
[0088] A specific embodiment is provided below with respect to the text intent identifying method recited in the foregoing embodiments, and the text to be processed "the mobile phone I bought does not work" is taken as an example.
[0089] Firstly, corpora of the after-sale type with sentence dimensions marked are deeply trained with a Text-CNN model, the corresponding data model is stored, and the corpora of the after-sale type include the piece of corpus, "the air-conditioner I just bought does not work".
[0090] Secondly, according to the word dimension marking result in the system, words of different types are trained by means of a Trie tree (dictionary tree) algorithm, and corresponding models are respectively stored. Included therein are such keywords with particularly apparent intention tendencies as buy, not work, activity, etc., and such corresponding category words as mobile phone, telephone set, refrigerator, air-conditioner, etc., for NER pickup of the text to be processed.
[0091] Moreover, a similarity algorithm of the corresponding intent is designed. For instance, a similarity algorithm of the purchasing intent can convert "the mobile phone I bought does not work" to a format of word vectors and calculate the cosine similarity of word vectors: "bought" (a keyword) and "mobile phone" (a category word) are compared with the word vectors "I", "just bought", "mobile phone", and "not work" of the text to be processed with any stopwords removed, and it can be derived that the similarity of the text to be processed is 53% under the purchasing intent.
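For orientation, cosine similarity between two term lists can be sketched with simple term-count vectors. This is only an assumption about the vector representation: the text does not specify the word vectors actually used, so the sketch is not expected to reproduce the 53% figure, and the function name is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between two term-count vectors."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty term list has no direction
    return dot / (norm_a * norm_b)
```

With count vectors, terms such as "bought" and "just bought" share no weight at all; dense word embeddings would give such near-synonyms a non-zero contribution, which is one reason a production system may score the same pair differently.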
[0092] Finally, the text to be processed is predicted by means of the Text-CNN model obtained via the corpora with marked sentence dimensions to derive a similarity of after-sale intent as 80%, it is therefore possible to derive that the intent of the text to be processed is an after-sale intent, the similar question is: the air-conditioner I just bought does not work, and the point of knowledge to which the similar question corresponds is: after-sale maintenance.
[0093] Through comparison of similarities of the two, it is derived that the intent of the text to be processed is after-sale maintenance.
[0094] Therefore, the present application has solved the difficult problem of obtaining the final intent of the user by enabling corpora with marked word dimensions and corpora with marked sentence dimensions to simultaneously exert functions in the case the corpora with marked sentence dimensions are insufficient, so as to avoid the problem that precision of user intent identification is low when the corpora with marked sentence dimensions are insufficient.
[0095] As should be understood, although the various steps in the flowcharts are sequentially displayed as indicated by arrows, these steps are not necessarily executed in the sequences indicated by arrows. Unless explicitly noted in this paper, execution of these steps is not restricted by any sequence, as these steps can also be executed in other sequences (other than those indicated in the drawings). Moreover, at least partial steps in the flowcharts may include plural sub-steps or multi-phases, these sub-steps or phases are not necessarily completed at the same timing, but can be executed at different timings, and these sub-steps or phases are also not necessarily sequentially performed, but can be performed in turns or alternately with other steps or with at least some of sub-steps or phases of other steps.
[0096] The present application further provides a text intent identifying device. As shown in Fig. 8, the device comprises a first obtaining module 10, a second obtaining module 20, a first determining module 30, a third obtaining module 40, a fourth obtaining module 50, and a second determining module 60.
[0097] The first obtaining module 10 is employed for obtaining a text to be processed; the second obtaining module 20 is employed for inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents; the first determining module 30 is employed for determining a first candidate intent of the text to be processed according to the similar corpus; the third obtaining module 40 is employed for extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information; the fourth obtaining module 50 is employed for obtaining a second similarity between the entity information and the text to be processed; and the second determining module 60 is employed for screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
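One way to picture how the modules cooperate is a small class whose fields stand in for modules 20 to 50 and whose method plays the role of module 60. Everything here is a hypothetical sketch: the class name, field names, and callable signatures are assumptions, and the final comparison uses the "greater than or equal to" rule of one embodiment of step S112.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TextIntentIdentifier:
    classify: Callable[[str], Tuple[str, float]]          # module 20: (similar corpus, first similarity)
    intent_of_corpus: Callable[[str], str]                # module 30: first candidate intent
    extract_entities: Callable[[str], List[str]]          # module 40: entity information
    intent_of_entities: Callable[[List[str]], str]        # module 40: second candidate intent
    entity_similarity: Callable[[List[str], str], float]  # module 50: second similarity

    def identify(self, text: str) -> str:
        """Module 60: screen the final intent out of the two candidates."""
        similar_corpus, first_similarity = self.classify(text)
        first_candidate = self.intent_of_corpus(similar_corpus)
        entities = self.extract_entities(text)
        second_candidate = self.intent_of_entities(entities)
        second_similarity = self.entity_similarity(entities, text)
        if first_similarity >= second_similarity:
            return first_candidate
        return second_candidate
```

Passing the modules in as callables lets each one be realized in software or backed by a separately trained model without changing the screening step.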
[0098] In one of the embodiments, when the first similarity is greater than a first preset value and smaller than a second preset value, the extracting operation of the third obtaining module 40 is realized, wherein the first preset value is smaller than the second preset value; the text intent identifying device further comprises (not shown in Fig. 8): a third determining module for taking the first candidate intent to serve as the final intent when the first similarity is greater than or equal to the second preset value; and/or generating reminder information when the first similarity is smaller than or equal to the first preset value.
[0099] In one of the embodiments, the third obtaining module 40 includes (not shown in Fig. 8): a first obtaining unit, for obtaining plural preset term types, each of which is associated with a first preset intent; a second obtaining unit, for obtaining a word searching algorithm to which the various preset term types correspond, wherein the word searching algorithm is employed for searching terms to which the various preset term types correspond; an extracting unit, for extracting terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtaining plural first target terms of the text to be processed; and a generating unit, for generating the entity information according to the plural first target terms.
[0100] In one of the embodiments, the third obtaining module 40 includes (not shown in Fig. 8): a third obtaining unit, for obtaining a preset intent set, wherein the preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms; a fourth obtaining unit, for obtaining plural first target terms in the entity information; and a screening unit, for screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, and determining the second candidate intent according to the target intent.
[0101] In one of the embodiments, the screening unit includes: a first obtaining subunit, for obtaining a preset keyword; a first screening subunit, for performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent;
and a second screening subunit, for performing term-matching on the plural first target terms respectively with the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.

[0102] In one of the embodiments, the fourth obtaining module 50 includes (not shown in Fig. 8): a fifth obtaining unit, for obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed; a first determining unit, for taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and a second determining unit, for taking the first sub-similarity to serve as the second similarity, when there is one target intent; the second determining module 60 includes: a third determining unit, for taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity; a fourth determining unit, for taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and a fifth determining unit, for taking a target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
[0103] In one of the embodiments, the fourth obtaining module 50 includes (not shown in Fig. 8): a term segmenting unit, for performing term-segmentation on the text to be processed, and obtaining plural second target terms of the text to be processed; a sixth obtaining unit, for obtaining a first number of the first target terms and a second number of the second target terms; and a sixth determining unit, for obtaining a ratio of the first number to the second number, and determining the second similarity according to the ratio.
[0104] For specific definitions relevant to the text intent identifying device, reference may be made to the aforementioned definitions of the text intent identifying method, which are not repeated here. The various modules in the aforementioned text intent identifying device can be wholly or partly realized via software, hardware, or a combination of software with hardware. The various modules can be embedded in or independent of a processor in a computer equipment in the form of hardware, and can also be stored in the form of software in a memory in a computer equipment, so as to facilitate the processor to invoke and perform operations corresponding to the aforementioned various modules.
[0105] In one embodiment, a computer equipment is provided, the computer equipment can be a client server supporting running on a service platform, and its internal structure can be as shown in Fig. 9. The computer equipment comprises a processor, a memory, a network interface and a database connected to each other via a system bus. The processor of the computer equipment is employed to provide computing and controlling capabilities. The memory of the computer equipment includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the running of the operating system and the computer program in the nonvolatile storage medium. The network interface of the computer equipment is employed to connect to an external terminal to read texts to be processed on the terminal. The computer program realizes a text intent identifying method when it is executed by a processor.
[0106] As understandable to persons skilled in the art, the structure illustrated in Fig. 9 is merely a block diagram of partial structure relevant to the solution of the present application, and does not constitute any restriction to the computer equipment on which the solution of the present application is applied, as the specific computer equipment may comprise component parts that are more than or less than those illustrated in Fig. 9, or may combine certain component parts, or may have different layout of component parts.
[0107] In one embodiment, there is provided a computer equipment that comprises a memory, a processor and a computer program stored on the memory and operable on the processor, and the following steps are realized when the processor executes the computer program:
[0108] obtaining a text to be processed; inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents; determining a first candidate intent of the text to be processed according to the similar corpus; extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information; obtaining a second similarity between the entity information and the text to be processed; and screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
[0109] In one of the embodiments, the processor executes the computer program to realize the step of extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information when the first similarity is greater than a first preset value and smaller than a second preset value, wherein the first preset value is smaller than the second preset value; at this time, the following steps are further realized when the processor executes the computer program: taking the first candidate intent to serve as the final intent when the first similarity is greater than or equal to the second preset value; and/or generating reminder information when the first similarity is smaller than or equal to the first preset value.
[0110] In one of the embodiments, when the processor executes the computer program to realize the step of extracting entity information of the text to be processed, the following steps are specifically realized: obtaining plural preset term types, each of which is associated with a first preset intent; obtaining a word searching algorithm to which the various preset term types correspond, wherein the word searching algorithm is employed for searching terms to which the various preset term types correspond; extracting terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtaining plural first target terms of the text to be processed; and generating the entity information according to the plural first target terms.

[0111] In one of the embodiments, when the processor executes the computer program to realize the step of obtaining a second candidate intent of the text to be processed according to the entity information, the following steps are specifically realized:
obtaining a preset intent set, wherein the preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms; obtaining plural first target terms in the entity information; and screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, and determining the second candidate intent according to the target intent.
[0112] In one of the embodiments, when the processor executes the computer program to realize the step of screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, the following steps are specifically realized: obtaining a preset keyword;
performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
[0113] In one of the embodiments, when the processor executes the computer program to realize the step of obtaining a second similarity between the entity information and the text to be processed, the following steps are specifically realized: obtaining a first sub-similarity Date Regue/Date Received 2022-09-06 between a first target term to which the target intent corresponds and the text to be processed;
taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent; when the processor executes the computer program to realize the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity, the following steps are specifically realized: taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity; taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
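The comparison described above can be sketched as follows, under stated assumptions: the first sub-similarities are supplied as a mapping from each target intent to its score, and the fallback used when no target intent exists is an assumption of this sketch, since the embodiment does not address that case. All names are hypothetical.

```python
from typing import Dict


def select_final_intent(
    first_candidate: str,
    first_similarity: float,
    sub_similarities: Dict[str, float],
) -> str:
    """Screen the final intent out of the first candidate and the target intents.

    sub_similarities maps each target intent to its first sub-similarity.
    """
    if not sub_similarities:
        # Assumption: with no target intent, fall back to the first candidate.
        return first_candidate

    # The second similarity is the highest first sub-similarity; with a single
    # target intent this is simply that intent's own sub-similarity.
    best_intent, second_similarity = max(
        sub_similarities.items(), key=lambda item: item[1]
    )

    # Ties go to the first candidate ("greater than or equal to").
    if first_similarity >= second_similarity:
        return first_candidate
    return best_intent
```

Because the maximum is taken over the mapping, the cases of one target intent and of plural target intents are handled by the same code path.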
[0114] In one of the embodiments, when the processor executes the computer program to realize the step of obtaining a second similarity between the entity information and the text to be processed, the following steps are specifically realized: performing term-segmentation on the text to be processed, and obtaining plural second target terms of the text to be processed;
obtaining a first number of the first target terms and a second number of the second target terms; and obtaining a ratio of the first number to the second number, and determining the second similarity according to the ratio.
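A minimal sketch of the ratio-based second similarity follows. It assumes that the ratio itself is used as the second similarity (the text only says the similarity is determined according to the ratio) and that an empty segmentation yields 0.0; the function name is hypothetical.

```python
from typing import List


def ratio_similarity(
    first_target_terms: List[str], second_target_terms: List[str]
) -> float:
    """Ratio of the first number (entity terms) to the second number (segmented terms)."""
    if not second_target_terms:
        # Assumption: an empty segmentation gives a similarity of 0.0.
        return 0.0
    return len(first_target_terms) / len(second_target_terms)
```

For example, if whitespace splitting stands in for term-segmentation, the text "how do I get a refund" yields six second target terms, and a single first target term `"refund"` gives a ratio of 1/6.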
[0115] In one embodiment, there is provided a computer-readable storage medium storing thereon a computer program, and the following steps are realized when the computer program is executed by a processor:
[0116] obtaining a text to be processed; inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed as output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents; determining a first candidate intent of the text to be processed according to the similar corpus; extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information;
obtaining a second similarity between the entity information and the text to be processed; and screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity.
[0117] In one of the embodiments, the computer program is executed by a processor to realize the step of extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information when the first similarity is greater than a first preset value and smaller than a second preset value, wherein the first preset value is smaller than the second preset value; at this time, the following steps are further realized when the computer program is executed by a processor:
taking the first candidate intent to serve as the final intent when the first similarity is greater than or equal to the second preset value; and/or generating reminder information when the first similarity is smaller than or equal to the first preset value.
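The two preset values partition the first similarity into three ranges, which can be sketched as the routing function below. The returned labels and the function name are hypothetical, and the concrete thresholds in the usage note are illustrative only; the embodiment does not specify values.

```python
def route_by_first_similarity(
    first_similarity: float,
    first_preset_value: float,
    second_preset_value: float,
) -> str:
    """Decide which branch to take based on the first similarity."""
    if first_preset_value >= second_preset_value:
        raise ValueError("first preset value must be smaller than the second")

    if first_similarity >= second_preset_value:
        # High confidence: the first candidate intent is the final intent.
        return "use_first_candidate"
    if first_similarity <= first_preset_value:
        # Low confidence: generate reminder information.
        return "generate_reminder"
    # In between: extract entity information and obtain a second candidate intent.
    return "extract_entities"
```

With illustrative thresholds of 0.3 and 0.8, a first similarity of 0.5 leads to entity extraction, 0.8 selects the first candidate intent directly, and 0.3 generates reminder information.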
[0118] In one of the embodiments, when the computer program is executed by a processor to realize the step of extracting entity information of the text to be processed, the following steps are specifically realized: obtaining plural preset term types, each of which is associated with a first preset intent; obtaining a word searching algorithm to which the various preset term types correspond, wherein the word searching algorithm is employed for searching terms to which the various preset term types correspond; extracting terms to which the various preset term types correspond from the text to be processed according to the word searching algorithm to which the various preset term types correspond, and obtaining plural first target terms of the text to be processed; and generating the entity information according to the plural first target terms.
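As one possible illustration, the per-type extraction can be sketched with a dictionary tree (trie) serving as the word searching algorithm for every term type. The vocabularies, the longest-match scanning rule, and the representation of the entity information as a mapping from term type to found terms are assumptions of this sketch. The search works on characters and ignores word boundaries, which suits unsegmented text but may over-match in space-delimited languages.

```python
from typing import Callable, Dict, Iterable, List

_END = "$end"  # multi-character marker, so it never collides with a single character


class Trie:
    """Minimal dictionary tree over characters."""

    def __init__(self, words: Iterable[str]) -> None:
        self.root: dict = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[_END] = True

    def find_all(self, text: str) -> List[str]:
        """Scan left to right, taking the longest dictionary word at each position."""
        found: List[str] = []
        i = 0
        while i < len(text):
            node, j, longest = self.root, i, 0
            while j < len(text) and text[j] in node:
                node = node[text[j]]
                j += 1
                if _END in node:
                    longest = j - i
            if longest:
                found.append(text[i:i + longest])
                i += longest
            else:
                i += 1
        return found


def extract_entity_information(
    text: str, searchers: Dict[str, Callable[[str], List[str]]]
) -> Dict[str, List[str]]:
    """Run the word searching algorithm of each preset term type over the text."""
    return {term_type: search(text) for term_type, search in searchers.items()}
```

Because each term type maps to its own callable, a different search algorithm can be substituted for any single type without changing `extract_entity_information`.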

[0119] In one of the embodiments, when the computer program is executed by a processor to realize the step of obtaining a second candidate intent of the text to be processed according to the entity information, the following steps are specifically realized:
obtaining a preset intent set, wherein the preset intent set includes plural second preset intents, and the various second preset intents are associated with plural preset terms; obtaining plural first target terms in the entity information; and screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, and determining the second candidate intent according to the target intent.
[0120] In one of the embodiments, when the computer program is executed by a processor to realize the step of screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set, the following steps are specifically realized: obtaining a preset keyword;
performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
[0121] In one of the embodiments, when the computer program is executed by a processor to realize the step of obtaining a second similarity between the entity information and the text to be processed, the following steps are specifically realized: obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;
taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent; when the computer program is executed by a processor to realize the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity, the following steps are specifically realized: taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity; taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
[0122] In one of the embodiments, when the computer program is executed by a processor to realize the step of obtaining a second similarity between the entity information and the text to be processed, the following steps are specifically realized: performing term-segmentation on the text to be processed, and obtaining plural second target terms of the text to be processed;
obtaining a first number of the first target terms and a second number of the second target terms; and obtaining a ratio of the first number to the second number, and determining the second similarity according to the ratio.
[0123] As comprehensible to persons ordinarily skilled in the art, the entire or partial flows in the methods according to the aforementioned embodiments can be completed by instructing relevant hardware via a computer program, the computer program can be stored in a nonvolatile computer-readable storage medium, and the computer program can include the flows as embodied in the aforementioned various methods when executed. Any reference to the memory, storage, database or other media used in the various embodiments provided by the present application can all include nonvolatile and/or volatile memory/memories. The nonvolatile memory can include a read-only memory (ROM), a programmable ROM (PROM), an electrically programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM) or a flash memory. The volatile memory can include a random access memory (RAM) or an external cache memory. To serve as explanation rather than restriction, the RAM is obtainable in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0124] Technical features of the aforementioned embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the aforementioned embodiments are described, but all such combinations should be considered to fall within the scope recorded in the Description as long as the combined technical features are not mutually contradictory.
[0125] The foregoing embodiments are merely directed to several modes of execution of the present application, and their descriptions are relatively specific and detailed, but they should not be hence misunderstood as restrictions to the inventive patent scope. As should be pointed out, persons with ordinary skill in the art may further make various modifications and improvements without departing from the conception of the present application, and all these should pertain to the protection scope of the present application.
Accordingly, the patent protection scope of the present application shall be based on the attached Claims.


Claims (145)

Claims:
1. A text intent identifying device, characterized in that the device comprises:
a first obtaining module, for obtaining a text to be processed;
a second obtaining module, for inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
a first determining module, for determining a first candidate intent of the text to be processed according to the similar corpus;
a third obtaining module, for extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information;
a fourth obtaining module, for obtaining a second similarity between the entity information and the text to be processed;
a second determining module, for screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity; and wherein, extracting entity information of the text to be processed comprises:
acquiring a plurality of preset term types, wherein each preset term type is associated with at least one first preset intent;
acquiring a word searching algorithm corresponding to each preset term type, wherein the word searching algorithm is used for searching words corresponding to each preset term type;

Date reçue/Date received 2024-01-23
extracting words corresponding to the preset term types from the text to be processed according to the word searching algorithms corresponding to the preset term types to obtain a plurality of first target words of the text to be processed;
generating the entity information according to the plurality of first target words;
wherein obtaining the second candidate intent of the text to be processed according to the entity information includes:
acquiring a preset intent set, wherein the preset intent set comprises a plurality of second preset intents, and each second preset intent is associated with a plurality of preset words;
acquiring the plurality of first target words in the entity information;
screening out target intents from the preset intent set according to the first target words and preset words related to second preset intents in the preset intent set, and determining second candidate intents according to the target intents;
and wherein obtaining the second similarity between the entity information and the text to be processed includes:
performing word segmentation on the text to be processed to obtain a plurality of second target words of the text to be processed;
acquiring a first quantity of the first target words and a second quantity of the second target words; and acquiring the ratio of the first quantity to the second quantity, and determining the second similarity according to the ratio.
2. The device of claim 1 where the text to be processed is a Q&A type text information sent by a user to the first obtaining module.

3. The device of claim 1 wherein the text classification model outputs at least one candidate similar corpus and at least one similarity between the at least one candidate similar corpus and the text to be processed, and wherein the candidate similar corpus with the highest similarity with the text to be processed is selected to serve as the similar corpus.
4. The device of claim 1 wherein the text classification model is a Text-CNN model.
5. The device of claim 1 wherein the text classification model is trained after removing stopwords from Q&A corpora with marked sentence dimensions.
6. The device of claim 5 wherein stopwords are removed from the text to be processed before the text to be processed is input into the text classification model.
7. The device of any one of claims 1 to 6 wherein the device stores plural pieces of corpora with marked intents.
8. The device of any one of claims 1 to 6 wherein the second obtaining module obtains a standard corpus according to the similar corpus and the first determining module determines the first candidate intent according to the standard corpus.
9. The device of claim 8 wherein the first candidate intent of the text to be processed is determined according to a standard question.
10. The device of any one of claims 1 to 9 wherein the entity information includes segmented terms of the text to be processed.
11. The device of any one of claims 1 to 10 wherein the entity information includes category words, brand words, hot words, and/or keywords.
12. The device of any one of claims 1 to 11 wherein the entity information includes semantics of the text to be processed.
13. The device of any one of claims 1 to 12 wherein the third obtaining module obtains plural preset intents which correspond to associated information and the second candidate intent is determined according to a matching relationship between the entity information and the associated information of the plural preset intents.
14. The device of any one of claims 1 to 13 wherein the second similarity is a similarity between semantics of the entity information and the text to be processed.
15. The device of any one of claims 1 to 14 wherein the second similarity is determined according to the proportion of segmented terms of the text to be processed.
16. The device of any one of claims 1 to 15 wherein when the first similarity is greater than or equal to the second similarity the final intent is the first candidate intent.
17. The device of any one of claims 1 to 15 wherein when the first similarity is smaller than the second similarity the final intent is the second candidate intent.
18. The device of any one of claims 1 to 17 wherein the device sets a precondition.
19. The device of claim 18 wherein the precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value.
20. The device of claim 19 wherein when the precondition is not met because the first similarity is greater than or equal to the second preset value the first candidate intent is set as the final intent.
21. The device of claim 19 wherein when the precondition is not met because the first similarity is smaller than or equal to the second preset value reminder information is generated.
22. The device of claim 1 wherein the plural term types include category words, hot words, brand words, and keywords, each of which corresponds to at least one first preset intent.
23. The device of claim 1 wherein a word searching algorithm is employed for searching terms to which the preset term types correspond.

24. The device of claim 23 wherein the word searching algorithm is a single word searching algorithm.
25. The device of claim 23 wherein the word searching algorithm is a dictionary tree searching algorithm.
26. The device of claim 1 wherein term segmentation is performed with NER pickup by means of word dimension corpora to obtain the entity information.
27. The device according to claim 1, wherein the third obtaining module is further configured for:
obtaining a preset keyword;
performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
28. The device of claim 27 wherein preset eliminated words are eliminated from the plural first target terms to obtain the plural object terms.
29. The device according to claim 27, wherein the fourth obtaining module is further configured for:
obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;

taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent;
and that the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity includes:
taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
30. A text intent identifying system, characterized in that the system comprises:
a first obtaining module, for obtaining a text to be processed;
a second obtaining module, for inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
a first determining module, for determining a first candidate intent of the text to be processed according to the similar corpus;

a third obtaining module, for extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information;
a fourth obtaining module, for obtaining a second similarity between the entity information and the text to be processed;
a second determining module, for screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity; and wherein, extracting entity information of the text to be processed comprises:
acquiring a plurality of preset term types, wherein each preset term type is associated with at least one first preset intent;
acquiring a word searching algorithm corresponding to each preset term type, wherein the word searching algorithm is used for searching words corresponding to each preset term type;
extracting words corresponding to the preset term types from the text to be processed according to the word searching algorithms corresponding to the preset term types to obtain a plurality of first target words of the text to be processed;
generating the entity information according to the plurality of first target words;
wherein obtaining the second candidate intent of the text to be processed according to the entity information includes:
acquiring a preset intent set, wherein the preset intent set comprises a plurality of second preset intents, and each second preset intent is associated with a plurality of preset words;
acquiring the plurality of first target words in the entity information;

screening out target intents from the preset intent set according to the first target words and preset words related to second preset intents in the preset intent set, and determining second candidate intents according to the target intents;
and wherein obtaining the second similarity between the entity information and the text to be processed includes:
performing word segmentation on the text to be processed to obtain a plurality of second target words of the text to be processed;
acquiring a first quantity of the first target words and a second quantity of the second target words; and acquiring the ratio of the first quantity to the second quantity, and determining the second similarity according to the ratio.
31. The system of claim 30 where the text to be processed is a Q&A type text information sent by a user to the first obtaining module.
32. The system of claim 30 wherein the text classification model outputs at least one candidate similar corpus and at least one similarity between the at least one candidate similar corpus and the text to be processed, and wherein the candidate similar corpus with the highest similarity with the text to be processed is selected to serve as the similar corpus.
33. The system of claim 30 wherein the text classification model is a Text-CNN model.
34. The system of claim 30 wherein the text classification model is trained after removing stopwords from Q&A corpora with marked sentence dimensions.
35. The system of claim 34 wherein stopwords are removed from the text to be processed before the text to be processed is input into the text classification model.
36. The system of any one of claims 30 to 35 wherein the device stores plural pieces of corpora with marked intents.
37. The system of any one of claims 30 to 35 wherein the second obtaining module obtains a standard corpus according to the similar corpus and the first determining module determines the first candidate intent according to the standard corpus.
38. The system of claim 37 wherein the first candidate intent of the text to be processed is determined according to a standard question.
39. The system of any one of claims 30 to 38 wherein the entity information includes segmented terms of the text to be processed.
40. The system of any one of claims 30 to 39 wherein the entity information includes category words, brand words, hot words, and/or keywords.
41. The system of any one of claims 30 to 40 wherein the entity information includes semantics of the text to be processed.
42. The system of any one of claims 30 to 41 wherein the third obtaining module obtains plural preset intents which correspond to associated information and the second candidate intent is determined according to a matching relationship between the entity information and the associated information of the plural preset intents.
43. The system of any one of claims 30 to 42 wherein the second similarity is a similarity between semantics of the entity information and the text to be processed.
44. The system of any one of claims 30 to 43 wherein the second similarity is determined according to the proportion of segmented terms of the text to be processed.
45. The system of any one of claims 30 to 44 wherein when the first similarity is greater than or equal to the second similarity the final intent is the first candidate intent.
46. The system of any one of claims 30 to 44 wherein when the first similarity is smaller than the second similarity the final intent is the second candidate intent.
47. The system of any one of claims 30 to 46 wherein the device sets a precondition.

48. The system of claim 47 wherein the precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value.
49. The system of claim 48 wherein when the precondition is not met because the first similarity is greater than or equal to the second preset value the first candidate intent is set as the final intent.
50. The system of claim 48 wherein when the precondition is not met because the first similarity is smaller than or equal to the second preset value reminder information is generated.
51. The system of claim 30 wherein the plural term types include category words, hot words, brand words, and keywords, each of which corresponds to at least one first preset intent.
52. The system of claim 30 wherein the word searching algorithm is employed for searching terms to which the preset term types correspond.
53. The system of claim 52 wherein the word searching algorithm is a single word searching algorithm.
54. The system of claim 52 wherein the word searching algorithm is a dictionary tree searching algorithm.
55. The system of claim 30 wherein term segmentation is performed with NER pickup by means of word dimension corpora to obtain the entity information.
56. The system according to claim 30, wherein the third obtaining module is further configured for:
obtaining a preset keyword;

performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
57. The system of claim 56 wherein preset eliminated words are eliminated from the plural first target terms to obtain the plural object terms.
58. The system according to claim 56, wherein the fourth obtaining module is further configured for:
obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;
taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent;
and that the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity includes:
taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;

taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
59. A text intent identifying method comprising:
obtaining, at a service platform through a terminal, a text to be processed;
inputting the text to be processed into a text classification model, and obtaining a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
determining a first candidate intent of the text to be processed according to the similar corpus;
extracting entity information of the text to be processed, and obtaining a second candidate intent of the text to be processed according to the entity information;
obtaining a second similarity between the entity information and the text to be processed;
screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity; and wherein, extracting entity information of the text to be processed comprises:
acquiring a plurality of preset term types, wherein each preset term type is associated with at least one first preset intent;
acquiring a word searching algorithm corresponding to each preset term type, wherein the word searching algorithm is used for searching words corresponding to each preset term type;

extracting words corresponding to the preset term types from the text to be processed according to the word searching algorithms corresponding to the preset term types to obtain a plurality of first target words of the text to be processed;
generating the entity information according to the plurality of first target words;
wherein obtaining the second candidate intent of the text to be processed according to the entity information includes:
acquiring a preset intent set, wherein the preset intent set comprises a plurality of second preset intents, and each second preset intent is associated with a plurality of preset words;
acquiring the plurality of first target words in the entity information;
screening out target intents from the preset intent set according to the first target words and preset words related to second preset intents in the preset intent set, and determining second candidate intents according to the target intents;
and wherein obtaining the second similarity between the entity information and the text to be processed includes:
performing word segmentation on the text to be processed to obtain a plurality of second target words of the text to be processed;
acquiring a first quantity of the first target words and a second quantity of the second target words; and acquiring the ratio of the first quantity to the second quantity, and determining the second similarity according to the ratio.
60. The method of claim 59 where the text to be processed is a Q&A type text information sent by a user to the service platform through the terminal.
61. The method of claim 59 wherein the text classification model outputs at least one candidate similar corpus and at least one similarity between the at least one candidate similar corpus and the text to be processed, and wherein the candidate similar corpus with the highest similarity with the text to be processed is selected to serve as the similar corpus.
62. The method of claim 59 wherein the text classification model is a Text-CNN model.
63. The method of claim 59 wherein the text classification model is trained after removing stopwords from Q&A corpora with marked sentence dimensions.
64. The method of claim 63 wherein stopwords are removed from the text to be processed before the text to be processed is input into the text classification model.
65. The method of any one of claims 59 to 64 wherein the service platform stores plural pieces of corpora with marked intents.
66. The method of any one of claims 59 to 64 wherein the service platform obtains a standard corpus according to the similar corpus and determines the first candidate intent according to the standard corpus.
67. The method of claim 66 wherein the first candidate intent of the text to be processed is determined according to a standard question.
68. The method of any one of claims 59 to 67 wherein the entity information includes segmented terms of the text to be processed.
69. The method of any one of claims 59 to 68 wherein the entity information includes category words, brand words, hot words, and/or keywords.
70. The method of any one of claims 59 to 69 wherein the entity information includes semantics of the text to be processed.

71. The method of any one of claims 59 to 70 wherein the service platform contains plural preset intents which correspond to associated information and the second candidate intent is determined according to a matching relationship between the entity information and the associated information of the plural preset intents.
72. The method of any one of claims 59 to 71 wherein the second similarity is a similarity between semantics of the entity information and the text to be processed.
73. The method of any one of claims 59 to 72 wherein the second similarity is determined according to the proportion of segmented terms of the text to be processed.
74. The method of any one of claims 59 to 73 wherein when the first similarity is greater than or equal to the second similarity the final intent is the first candidate intent.
75. The method of any one of claims 59 to 73 wherein when the first similarity is smaller than the second similarity the final intent is the second candidate intent.
76. The method of any one of claims 59 to 75 wherein the service platform sets a precondition.
77. The method of claim 76 wherein the precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value.
78. The method of claim 77 wherein when the precondition is not met because the first similarity is greater than or equal to the second preset value the first candidate intent is set as the final intent.
79. The method of claim 77 wherein when the precondition is not met because the first similarity is smaller than or equal to the second preset value reminder information is generated.
80. The method of claim 59 wherein the plural term types includes category words, hot words, brand words, and keywords, each of which correspond to at least one first preset intent.
81. The method of claim 59 wherein the word searching algorithm is employed for searching terms to which the preset term types correspond.

82. The method of claim 81 wherein the word searching algorithm is a single word searching algorithm.
83. The method of claim 81 wherein the word searching algorithm is a dictionary tree searching algorithm.
84. The method of claim 59 wherein term segmentation is performed with NER pickup by means of word dimension corpora to obtain the entity information.
85. The method according to claim 59, wherein the step of screening a target intent out of the preset intent set according to the plural first target terms and the preset terms associated with the various second preset intents in the preset intent set includes:
obtaining a preset keyword;
performing term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screening a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and performing term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screening a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
86. The method of claim 85 wherein preset eliminated words set in the service platform are eliminated from the plural first target terms to obtain the plural object terms.
87. The method according to claim 85, wherein the step of obtaining a second similarity between the entity information and the text to be processed includes:
obtaining a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;

taking a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and taking the first sub-similarity to serve as the second similarity, when there is one target intent;
and that the step of screening a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity includes:
taking the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
taking the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and taking the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
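The ratio-based second similarity recited in claim 59 can be sketched as follows. This is a minimal illustration under stated assumptions: the regular-expression segmenter `segment` is a stand-in for whatever word segmentation the platform actually uses, and using the ratio directly as the similarity is only one way of "determining the second similarity according to the ratio".

```python
import re
from typing import List, Sequence


def segment(text: str) -> List[str]:
    """Stand-in word segmenter (assumption): lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def second_similarity(first_target_words: Sequence[str], text: str) -> float:
    """Ratio of extracted entity words to all segmented words of the text."""
    second_target_words = segment(text)
    if not second_target_words:
        # Assumption: an empty text has no overlap with any entity words.
        return 0.0
    first_quantity = len(first_target_words)
    second_quantity = len(second_target_words)
    return first_quantity / second_quantity
```

With two first target words extracted from a text that segments into seven words, the sketch yields a second similarity of 2/7.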
88. A computer equipment comprising:
a computer readable physical memory;
a processor communicatively coupled to the memory, a computer program stored on the memory and operable on the processor, wherein the processor executes the computer program configured to:
obtain, at a service platform through a terminal, a text to be processed;

input the text to be processed into a text classification model, and obtain a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
determine a first candidate intent of the text to be processed according to the similar corpus;
extract entity information of the text to be processed, and obtain a second candidate intent of the text to be processed according to the entity information;
obtain a second similarity between the entity information and the text to be processed;
screen a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity;
and wherein, extracting entity information of the text to be processed comprises:
acquiring a plurality of preset term types, wherein each preset term type is associated with at least one first preset intent;
acquiring a word searching algorithm corresponding to each preset term type, wherein the word searching algorithm is used for searching words corresponding to each preset term type;
extracting words corresponding to the preset term types from the text to be processed according to the word searching algorithms corresponding to the preset term types to obtain a plurality of first target words of the text to be processed;
generating the entity information according to the plurality of first target words;
wherein obtaining the second candidate intent of the text to be processed according to the entity information includes:
acquiring a preset intent set, wherein the preset intent set comprises a plurality of second preset intents, and each second preset intent is associated with a plurality of preset words;

acquiring the plurality of first target words in the entity information;
screening out target intents from the preset intent set according to the first target words and preset words related to second preset intents in the preset intent set, and determining second candidate intents according to the target intents;
and wherein obtaining the second similarity between the entity information and the text to be processed includes:
performing word segmentation on the text to be processed to obtain a plurality of second target words of the text to be processed;
acquiring a first quantity of the first target words and a second quantity of the second target words; and acquiring the ratio of the first quantity to the second quantity, and determining the second similarity according to the ratio.
89. The equipment of claim 88 where the text to be processed is a Q&A type text information sent by a user to the service platform through the terminal.
90. The equipment of claim 88 wherein the text classification model outputs at least one candidate similar corpus and at least one similarity between the at least one candidate similar corpus and the text to be processed, and wherein the candidate similar corpus with the highest similarity with the text to be processed is selected to serve as the similar corpus.
91. The equipment of claim 88 wherein the text classification model is a Text-CNN model.
92. The equipment of claim 88 wherein the text classification model is trained after removing stopwords from Q&A corpora with marked sentence dimensions.
93. The equipment of claim 92 wherein stopwords are removed from the text to be processed before the text to be processed is input into the text classification model.
94. The equipment of any one of claims 88 to 93 wherein the service platform stores plural pieces of corpora with marked intents.

95. The equipment of any one of claims 88 to 93 wherein the service platform obtains a standard corpus according to the similar corpus and determines the first candidate intent according to the standard corpus.
96. The equipment of claim 95 wherein the first candidate intent of the text to be processed is determined according to a standard question.
97. The equipment of any one of claims 88 to 96 wherein the entity information includes segmented terms of the text to be processed.
98. The equipment of any one of claims 88 to 97 wherein the entity information includes category words, brand words, hot words, and/or keywords.
99. The equipment of any one of claims 88 to 98 wherein the entity information includes semantics of the text to be processed.
100. The equipment of any one of claims 88 to 99 wherein the service platform contains plural preset intents which correspond to associated information and the second candidate intent is determined according to a matching relationship between the entity information and the associated information of the plural preset intents.
101. The equipment of any one of claims 88 to 100 wherein the second similarity is a similarity between semantics of the entity information and the text to be processed.
102. The equipment of any one of claims 88 to 101 wherein the second similarity is determined according to the proportion of segmented terms of the text to be processed.
103. The equipment of any one of claims 88 to 102 wherein when the first similarity is greater than or equal to the second similarity the final intent is the first candidate intent.
104. The equipment of any one of claims 88 to 102 wherein when the first similarity is smaller than the second similarity the final intent is the second candidate intent.
105. The equipment of any one of claims 88 to 104 wherein the service platform sets a precondition.

106. The equipment of claim 105 wherein the precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value.
107. The equipment of claim 106 wherein when the precondition is not met because the first similarity is greater than or equal to the second preset value the first candidate intent is set as the final intent.
108. The equipment of claim 106 wherein when the precondition is not met because the first similarity is smaller than or equal to the second preset value reminder information is generated.
109. The equipment of claim 88 wherein the plural term types includes category words, hot words, brand words, and keywords, each of which correspond to at least one first preset intent.
110. The equipment of claim 88 wherein the word searching algorithm is employed for searching terms to which the preset term types correspond.
111. The equipment of claim 110 wherein the word searching algorithm is a single word searching algorithm.
112. The equipment of claim 111 wherein the word searching algorithm is a dictionary tree searching algorithm.
113. The equipment of claim 88 wherein term segmentation is performed with NER pickup by means of word dimension corpora to obtain the entity information.
114. The equipment according to claim 88, wherein the program is further configured to:
obtain a preset keyword;
perform term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screen a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and perform term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screen a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
115. The equipment of claim 114 wherein preset eliminated words set in the service platform are eliminated from the plural first target terms to obtain the plural object terms.
116. The equipment according to claim 114, wherein the program is further configured to:
obtain a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;
take a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and take the first sub-similarity to serve as the second similarity, when there is one target intent;
and take the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
take the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and take the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
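The precondition described in claims 105 to 108 can be pictured as a three-way routing step on the first similarity. The sketch below is an interpretation, not the claimed implementation: the threshold values 0.3 and 0.9, the `Route` names, and the function `route_by_precondition` are hypothetical. It also assumes that the reminder branch applies when the first similarity is at or below the lower (first) preset value, which is a reading adopted here for illustration rather than the literal claim wording.

```python
from enum import Enum


class Route(Enum):
    ACCEPT_FIRST_CANDIDATE = "accept_first_candidate"
    COMPARE_CANDIDATES = "compare_candidates"
    SEND_REMINDER = "send_reminder"


def route_by_precondition(
    first_similarity: float,
    first_preset_value: float = 0.3,   # hypothetical lower threshold
    second_preset_value: float = 0.9,  # hypothetical upper threshold
) -> Route:
    """Decide how to proceed based on the first similarity."""
    if not first_preset_value < second_preset_value:
        raise ValueError("first preset value must be smaller than the second")

    if first_similarity >= second_preset_value:
        # Model is confident enough: take the first candidate intent directly.
        return Route.ACCEPT_FIRST_CANDIDATE
    if first_similarity > first_preset_value:
        # Precondition met: go on to compare first and second similarity.
        return Route.COMPARE_CANDIDATES
    # Assumed reading: similarity too low, so generate reminder information.
    return Route.SEND_REMINDER
```

Only the `COMPARE_CANDIDATES` route would proceed to the entity-based extraction and the comparison between the first and second similarity.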
117. A computer readable physical memory having stored thereon a computer program executed by a computer configured to:

obtain, at a service platform through a terminal, a text to be processed;
input the text to be processed into a text classification model, and obtain a similar corpus of the text to be processed output from the text classification model and a first similarity between the similar corpus and the text to be processed, wherein the text classification model is trained according to corpora with marked intents;
determine a first candidate intent of the text to be processed according to the similar corpus;
extract entity information of the text to be processed, and obtain a second candidate intent of the text to be processed according to the entity information;
obtain a second similarity between the entity information and the text to be processed;
screen a final intent of the text to be processed out of the first candidate intent and the second candidate intent according to the first similarity and the second similarity;
and wherein, extracting entity information of the text to be processed comprises:
acquiring a plurality of preset term types, wherein each preset term type is associated with at least one first preset intent;
acquiring a word searching algorithm corresponding to each preset term type, wherein the word searching algorithm is used for searching words corresponding to each preset term type;
extracting words corresponding to the preset term types from the text to be processed according to the word searching algorithms corresponding to the preset term types to obtain a plurality of first target words of the text to be processed;
generating the entity information according to the plurality of first target words;
wherein obtaining the second candidate intent of the text to be processed according to the entity information includes:
acquiring a preset intent set, wherein the preset intent set comprises a plurality of second preset intents, and each second preset intent is associated with a plurality of preset words;
acquiring the plurality of first target words in the entity information;
screening out target intents from the preset intent set according to the first target words and preset words related to second preset intents in the preset intent set, and determining second candidate intents according to the target intents;
and wherein obtaining the second similarity between the entity information and the text to be processed includes:
performing word segmentation on the text to be processed to obtain a plurality of second target words of the text to be processed;
acquiring a first quantity of the first target words and a second quantity of the second target words; and acquiring the ratio of the first quantity to the second quantity, and determining the second similarity according to the ratio.
118. The memory of claim 117 where the text to be processed is a Q&A type text information sent by a user to the service platform through the terminal.
119. The memory of claim 117 wherein the text classification model outputs at least one candidate similar corpus and at least one similarity between the at least one candidate similar corpus and the text to be processed, and wherein the candidate similar corpus with the highest similarity with the text to be processed is selected to serve as the similar corpus.
120. The memory of claim 117 wherein the text classification model is a Text-CNN model.
121. The memory of claim 117 wherein the text classification model is trained after removing stopwords from Q&A corpora with marked sentence dimensions.
122. The memory of claim 121 wherein stopwords are removed from the text to be processed before the text to be processed is input into the text classification model.

123. The memory of any one of claims 117 to 122 wherein the service platform stores plural pieces of corpora with marked intents.
124. The memory of any one of claims 117 to 122 wherein the service platform obtains a standard corpus according to the similar corpus and determines the first candidate intent according to the standard corpus.
125. The memory of claim 124 wherein the first candidate intent of the text to be processed is determined according to a standard question.
126. The memory of any one of claims 117 to 125 wherein the entity information includes segmented terms of the text to be processed.
127. The memory of any one of claims 117 to 126 wherein the entity information includes category words, brand words, hot words, and/or keywords.
128. The memory of any one of claims 117 to 127 wherein the entity information includes semantics of the text to be processed.
129. The memory of any one of claims 117 to 128 wherein the service platform contains plural preset intents which correspond to associated information and the second candidate intent is determined according to a matching relationship between the entity information and the associated information of the plural preset intents.
130. The memory of any one of claims 117 to 129 wherein the second similarity is a similarity between semantics of the entity information and the text to be processed.
131. The memory of any one of claims 117 to 130 wherein the second similarity is determined according to the proportion of segmented terms of the text to be processed.
132. The memory of any one of claims 117 to 131 wherein when the first similarity is greater than or equal to the second similarity the final intent is the first candidate intent.
133. The memory of any one of claims 117 to 131 wherein when the first similarity is smaller than the second similarity the final intent is the second candidate intent.

134. The memory of any one of claims 117 to 133 wherein the service platform sets a precondition.
135. The memory of claim 134 wherein the precondition is that the first similarity is greater than a first preset value and smaller than a second preset value, and that the first preset value is smaller than the second preset value.
136. The memory of claim 135 wherein when the precondition is not met because the first similarity is greater than or equal to the second preset value the first candidate intent is set as the final intent.
137. The memory of claim 135 wherein when the precondition is not met because the first similarity is smaller than or equal to the second preset value reminder information is generated.
138. The memory of claim 117 wherein the plural term types includes category words, hot words, brand words, and keywords, each of which correspond to at least one first preset intent.
139. The memory of claim 117 wherein a word searching algorithm is employed for searching terms to which the preset term types correspond.
140. The memory of claim 139 wherein the word searching algorithm is a single word searching algorithm.
141. The memory of claim 139 wherein the word searching algorithm is a dictionary tree searching algorithm.
142. The memory of claim 117 wherein term segmentation is performed with NER pickup by means of word dimension corpora to obtain the entity information.
143. The memory according to claim 117, wherein the program is further configured to:
obtain a preset keyword;

perform term-matching between the preset keyword and the preset terms associated with the various second preset intents when the plural first target terms include the preset keyword, and screen a first target sub-candidate intent out of the plural second preset intents of the preset intent set according to a term-matching result, wherein the first target sub-candidate intent serves as the target intent; and perform term-matching between the plural first target terms respectively and the preset terms associated with the various second preset intents when the plural first target terms do not include the preset keyword, and screen a second target sub-candidate intent out of the plural second preset intents of the preset intent set according to the term-matching result, wherein the second target sub-candidate intent serves as the target intent.
144. The memory of claim 143 wherein preset eliminated words set in the service platform are eliminated from the plural first target terms to obtain the plural object terms.
145. The memory according to claim 143, wherein the program is further configured to:
obtain a first sub-similarity between a first target term to which the target intent corresponds and the text to be processed;
take a first sub-similarity with the highest similarity in plural first sub-similarities to serve as the second similarity, when there are plural target intents, and therefore there are plural first sub-similarities; and take the first sub-similarity to serve as the second similarity, when there is one target intent;
and take the first candidate intent to serve as the final intent of the text to be processed, when the first similarity is greater than or equal to the second similarity;
take the target intent to which the second similarity corresponds to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include plural target intents; and take the target intent in the second candidate intents to serve as the final intent of the text to be processed, when the first similarity is smaller than the second similarity and the second candidate intents include one target intent.
Date recue/Date received 2024-01-23
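The claims name a dictionary tree searching algorithm as one option for extracting words of each preset term type. The following sketch shows how such a search might look with a character-level trie and a longest-match scan. The class and function names, the example term types, and the word lists are hypothetical, and the claims do not specify longest-match behaviour; it is assumed here for illustration.

```python
from typing import Dict, Iterable, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_trie(words: Iterable[str]) -> TrieNode:
    """Build a dictionary tree from the preset words of one term type."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def find_terms(text: str, root: TrieNode) -> List[str]:
    """Scan the text left to right, keeping the longest match at each position."""
    found: List[str] = []
    i = 0
    while i < len(text):
        node, j, longest = root, i, 0
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.is_word:
                longest = j - i
        if longest:
            found.append(text[i:i + longest])
            i += longest  # skip past the matched term
        else:
            i += 1
    return found


def extract_first_target_words(
    text: str, words_by_type: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Run one dictionary tree per preset term type over the text."""
    result: List[Tuple[str, str]] = []
    for term_type, words in words_by_type.items():
        root = build_trie(words)
        result.extend((term_type, term) for term in find_terms(text, root))
    return result
```

The scan matches at the character level without word boundaries, which suits unsegmented text such as Chinese; for space-delimited languages a boundary check would likely be needed to avoid matching inside longer words.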
CA3174601A 2020-03-05 2020-06-19 Text intent identifying method, device, computer equipment and storage medium Active CA3174601C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202010146166.XA CN111325037B (en) 2020-03-05 2020-03-05 Text intention recognition method and device, computer equipment and storage medium
CN202010146166.X 2020-03-05
PCT/CN2020/097006 WO2021174717A1 (en) 2020-03-05 2020-06-19 Text intent recognition method and apparatus, computer device and storage medium

Publications (2)

Publication Number Publication Date
CA3174601A1 CA3174601A1 (en) 2021-09-10
CA3174601C CA3174601C (en) 2024-04-02

Family

ID=71163911

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3174601A Active CA3174601C (en) 2020-03-05 2020-06-19 Text intent identifying method, device, computer equipment and storage medium

Country Status (3)

Country Link
CN (1) CN111325037B (en)
CA (1) CA3174601C (en)
WO (1) WO2021174717A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737962A (en) * 2020-06-24 2020-10-02 平安科技(深圳)有限公司 Entity revision method, entity revision device, computer equipment and readable storage medium
CN111931512A (en) * 2020-07-01 2020-11-13 联想(北京)有限公司 Statement intention determining method and device and storage medium
CN112231474A (en) * 2020-10-13 2021-01-15 中移(杭州)信息技术有限公司 Intention recognition method, system, electronic device and storage medium
CN112580350A (en) * 2020-12-30 2021-03-30 讯飞智元信息科技有限公司 Appeal analysis method and device, electronic equipment and storage medium
CN112668664B (en) * 2021-01-06 2022-11-15 安徽迪科数金科技有限公司 Intelligent voice-based conversational training method
CN113064984B (en) * 2021-04-25 2024-06-14 深圳壹账通智能科技有限公司 Intention recognition method, device, electronic equipment and readable storage medium
CN113836346B (en) * 2021-09-08 2023-08-08 网易(杭州)网络有限公司 Method, device, computing equipment and storage medium for generating abstract for audio file
CN114154509A (en) * 2021-11-26 2022-03-08 深圳集智数字科技有限公司 Intention determining method and device
CN114095282B (en) * 2022-01-21 2022-04-15 杭银消费金融股份有限公司 Wind control processing method and device based on short text feature extraction
CN114915514B (en) * 2022-03-28 2024-03-22 青岛海尔科技有限公司 Method and device for processing intention, storage medium and electronic device
CN115333768B (en) * 2022-06-29 2024-06-04 国家计算机网络与信息安全管理中心 Rapid studying and judging method for mass network attack
CN117556040A (en) * 2022-08-03 2024-02-13 马上消费金融股份有限公司 Text classification method, recognition method and device apparatus, storage medium
CN115859999B (en) * 2022-12-09 2023-07-07 河北尚云信息科技有限公司 Intention recognition method, device, electronic equipment and storage medium

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722558B (en) * 2012-05-29 2016-08-03 百度在线网络技术(北京)有限公司 A kind of method and apparatus recommending for user to put question to
CN104516986B (en) * 2015-01-16 2018-01-16 青岛理工大学 A kind of sentence recognition methods and device
CN104899285B (en) * 2015-06-04 2018-09-25 百度在线网络技术(北京)有限公司 Search result methods of exhibiting and device
CN105893444A (en) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 Sentiment classification method and apparatus
US20170242886A1 (en) * 2016-02-19 2017-08-24 Jack Mobile Inc. User intent and context based search results
CN108536708A (en) * 2017-03-03 2018-09-14 腾讯科技(深圳)有限公司 Automatic question answering processing method and automatic question answering system
CN107168991B (en) * 2017-03-28 2020-12-04 北京三快在线科技有限公司 Search result display method and device
CN108334533B (en) * 2017-10-20 2021-12-24 腾讯科技(深圳)有限公司 Keyword extraction method and device, storage medium and electronic device
US20190163691A1 (en) * 2017-11-30 2019-05-30 CrowdCare Corporation Intent Based Dynamic Generation of Personalized Content from Dynamic Sources
US10776582B2 (en) * 2018-06-06 2020-09-15 International Business Machines Corporation Supporting combinations of intents in a conversation
CN109947909B (en) * 2018-06-19 2024-03-12 平安科技(深圳)有限公司 Intelligent customer service response method, equipment, storage medium and device
CN109033305B (en) * 2018-07-16 2022-04-01 深圳前海微众银行股份有限公司 Question answering method, device and computer readable storage medium
CN109285030A (en) * 2018-08-29 2019-01-29 深圳壹账通智能科技有限公司 Products Show method, apparatus, terminal and computer readable storage medium
CN109522393A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Intelligent answer method, apparatus, computer equipment and storage medium
CN109740152B (en) * 2018-12-25 2023-02-17 腾讯科技(深圳)有限公司 Text category determination method and device, storage medium and computer equipment
CN109785840B (en) * 2019-03-05 2021-01-29 湖北亿咖通科技有限公司 Method and device for identifying natural language, vehicle-mounted multimedia host and computer readable storage medium
CN109977294B (en) * 2019-04-03 2020-04-28 三角兽(北京)科技有限公司 Information/query processing device, query processing/text query method, and storage medium
CN110069631B (en) * 2019-04-08 2022-11-29 腾讯科技(深圳)有限公司 Text processing method and device and related equipment
CN110096570B (en) * 2019-04-09 2021-03-30 苏宁易购集团股份有限公司 Intention identification method and device applied to intelligent customer service robot
CN110232114A (en) * 2019-05-06 2019-09-13 平安科技(深圳)有限公司 Sentence intent recognition method, device and computer readable storage medium
CN110276067B (en) * 2019-05-07 2022-11-22 创新先进技术有限公司 Text intention determining method and device
CN110162633B (en) * 2019-05-21 2022-02-11 深圳市珍爱云信息技术有限公司 Voice data intention determining method and device, computer equipment and storage medium
CN110209791B (en) * 2019-06-12 2021-03-26 百融云创科技股份有限公司 Multi-round dialogue intelligent voice interaction system and device
CN110427467B (en) * 2019-06-26 2022-10-11 深圳追一科技有限公司 Question-answer processing method, device, computer equipment and storage medium
CN110489538B (en) * 2019-08-27 2020-12-25 腾讯科技(深圳)有限公司 Statement response method and device based on artificial intelligence and electronic equipment
CN110704641B (en) * 2019-10-11 2023-04-07 零犀(北京)科技有限公司 Ten-thousand-level intention classification method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN111325037B (en) 2022-03-29
CN111325037A (en) 2020-06-23
WO2021174717A1 (en) 2021-09-10
CA3174601A1 (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CA3174601C (en) Text intent identifying method, device, computer equipment and storage medium
CN109829629B (en) Risk analysis report generation method, apparatus, computer device and storage medium
US20200257860A1 (en) Semantic recognition method, electronic device, and computer-readable storage medium
CN110909539A (en) Word generation method, system, computer device and storage medium of corpus
CN111563384B (en) Evaluation object identification method and device for E-commerce products and storage medium
CN109947903B (en) Idiom query method and device
CA3138556A1 (en) Apparatuses, storage medium and method of querying data based on vertical search
CN112256845A (en) Intention recognition method, device, electronic equipment and computer readable storage medium
CN111274822A (en) Semantic matching method, device, equipment and storage medium
CN116244410A (en) Index data analysis method and system based on knowledge graph and natural language
CN115526171A (en) Intention identification method, device, equipment and computer readable storage medium
CN110489032B (en) Dictionary query method for electronic book and electronic equipment
US20220058214A1 (en) Document information extraction method, storage medium and terminal
CN112487154B (en) Intelligent search method based on natural language
CN113569988A (en) Algorithm model evaluation method and system
CN110750626A (en) Scene-based task-driven multi-turn dialogue method and system
CN115994232B (en) Online multi-version document identity authentication method, system and computer equipment
CN116150376A (en) Sample data distribution optimization method, device and storage medium
CN116418705A (en) Network asset identification method, system, terminal and medium based on machine learning
CN114154480A (en) Information extraction method, device, equipment and storage medium
CN113869043A (en) Content labeling method, device, equipment and storage medium
CN114169331A (en) Address resolution method, device, computer equipment and storage medium
CN113779364A (en) Searching method based on label extraction and related equipment thereof
CN111931480A (en) Method and device for determining main content of text, storage medium and computer equipment
CN112632268B (en) Complaint work order detection processing method, complaint work order detection processing device, computer equipment and storage medium

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220906
