WO2021159656A1 - Method, device, and equipment for semantic completion in a multi-round dialogue, and storage medium - Google Patents

Publication number
WO2021159656A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
detection result
word
preset
supplementary
Prior art date
Application number
PCT/CN2020/098846
Other languages
French (fr)
Chinese (zh)
Inventor
黄孟缘
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021159656A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G06F40/30 Semantic analysis

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a method, device, equipment, and storage medium for semantic completion in multiple rounds of dialogue.
  • Human-computer interaction is the study of the interaction between systems and users. The system can be any of a variety of machines or computerized systems and software.
  • The user communicates with the system through a visible human-computer interaction interface and performs operations that exchange information between the user and the system in order to complete certain tasks.
  • Human-computer interaction is an important part of the field of artificial intelligence, especially for customer service and consultation. Through human-computer interaction, users can obtain the information they need.
  • Intelligent dialogue robots are used to communicate with users, and the data they feed back meets the users' need to obtain information.
  • The inventor realizes that a request obtained by an intelligent dialogue robot may be semantically unclear. If the request input by the user lacks a subject or object, the robot cannot accurately recognize the user's intention and cannot feed back response information that matches that intention, which lowers the accuracy of the robot's feedback.
  • The present application provides a method, device, equipment, and storage medium for semantic completion in multiple rounds of dialogue, which are used to solve the problem of incomplete sentence semantics based on the multiple rounds of dialogue, improving the accuracy of the semantic analysis result and also the accuracy of searching for the corresponding response information based on that result.
  • The first aspect of the embodiments of this application provides a method for semantic completion in multiple rounds of dialogue, including: using a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain a first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain a second sentence detection result, where the second sentence detection result is the target dependency relationship between the words in the second sentence; determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence; when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; determining whether the first supplementary sentence includes unclear words, where the unclear words include pronouns, quantifiers, and articles; and, if the first supplementary sentence includes an unclear word, replacing the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
  • A second aspect of the embodiments of the present application provides a device for semantic completion in multiple rounds of dialogue, including:
  • a first acquiring unit, configured to use a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain a first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first sentence;
  • a second acquiring unit, configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain a second sentence detection result, where the second sentence detection result is the target dependency relationship between the words in the second sentence;
  • a first judgment unit, configured to determine whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
  • a first completion unit, configured to, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence;
  • a second judgment unit, configured to determine whether the first supplementary sentence includes unclear words, where the unclear words include pronouns, quantifiers, and articles;
  • a second completion unit, configured to, if the first supplementary sentence includes an unclear word, replace the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
  • The third aspect of the embodiments of the present application provides equipment for semantic completion in multiple rounds of dialogue, including a memory, a processor, and a computer program stored in the memory and runnable on the processor.
  • When the processor executes the computer program, the method for semantic completion in multiple rounds of dialogue described in any of the above embodiments is implemented, for example the following steps: using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain a first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain a second sentence detection result, where the second sentence detection result is the target dependency relationship between the words in the second sentence; determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence; when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, filling in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; determining whether the first supplementary sentence includes unclear words, where the unclear words include pronouns, quantifiers, and articles; and, if the first supplementary sentence includes an unclear word, replacing the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
  • The fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the method described in the first aspect. For example, the following steps are implemented: using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain the first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain the second sentence detection result, where the second sentence detection result is the target dependency relationship between the words in the second sentence; determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence; and, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, filling in the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • The embodiments of the present application improve the accuracy of the semantic analysis result and also improve the accuracy of searching for the corresponding response information according to the semantic analysis result.
  • FIG. 1 is a schematic diagram of an embodiment of a method for semantic completion in multiple rounds of dialogue in this application;
  • FIG. 2 is a schematic diagram of another embodiment of a method for semantic completion in multiple rounds of dialogue in this application;
  • FIG. 3 is a schematic diagram of an embodiment of a device for complementing semantics in multiple rounds of dialogue in this application;
  • FIG. 4 is a schematic diagram of another embodiment of a device for complementing semantics in multiple rounds of dialogue in this application;
  • FIG. 5 is a schematic diagram of an embodiment of equipment for semantic completion in multiple rounds of dialogue in this application.
  • The present application provides a method for semantic completion in multiple rounds of dialogue, which is used to solve the problem of incomplete sentence semantics based on the multiple rounds of dialogue, improving the accuracy of the semantic analysis result and also the accuracy of searching for the corresponding response information based on that result.
  • The technical solution of this application can be applied to the field of artificial intelligence or big data technology.
  • The technical solution of this application can be implemented by a data platform such as a cloud computing platform.
  • An embodiment of the method for semantic completion in multiple rounds of dialogue in the embodiments of the present application includes:
  • Use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain the first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first sentence.
  • The server uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user and obtains the first sentence detection result, which is the target dependency relationship between the words in the first sentence.
  • The server segments the first sentence with the preset corpus sentence segmentation function and checks the grammar of the resulting first input sentence with the preset analysis function; both steps prepare for completing the semantics of the second sentence.
  • The server first uses the preset corpus sentence segmentation function to preliminarily segment the first sentence. Sentence segmentation is the basis of natural language processing: its accuracy directly determines the quality of the part-of-speech tagging, syntactic analysis, word vectors, and text analysis that the server performs afterwards. A simple first sentence can be segmented directly at commas.
  • After segmentation, the server marks the words in the first sentence according to their position in the sentence.
  • The server then uses the dependency analysis function to check the grammar of the first sentence. The dependency analysis function expresses grammar through the dependency relationships between the words in a sentence: when one word modifies another, the first word is said to depend on the second, which makes the relationships between the words in the sentence explicit.
  • In this way the server can accurately determine the part of speech of each word and the relationships between the words, which fully prepares for completing the sentence. Before performing dependency analysis, the server needs to tag the sentence with parts of speech. The server first tags each character in the first sentence according to its position within a word.
  • The BMES tagging method can be used, where B marks the first character of a multi-character word, M marks a character in the middle of a word (a word may have several middle characters), E marks the last character of a word, and S marks a word consisting of a single character.
  • For example, tagging "Online Merchant Bank is the most important product of Ant Financial's microfinance division" gives the result "BMMESBMMEBMMMESBMEBE", and the corresponding word segmentation result is "Online Merchant Bank/is/Ant Financial/microfinance division/of/most important/product".
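  • The BMES scheme described above can be illustrated with a short decoding sketch. The function name `decode_bmes` is hypothetical, and the Chinese sentence is an assumed reconstruction of the example (the published text gives only an English rendering); only the tag string comes from the text.

```python
def decode_bmes(chars: str, tags: str) -> list[str]:
    """Group characters into words according to their BMES tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":        # single-character word
            words.append(ch)
        elif tag == "B":      # first character of a multi-character word
            current = ch
        elif tag == "M":      # middle character of a word
            current += ch
        elif tag == "E":      # last character: the word is complete
            words.append(current + ch)
            current = ""
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return words


# Assumed reconstruction of the example sentence (not given in the source).
sentence = "网商银行是蚂蚁金服微贷事业部的最重要产品"
tags = "BMMESBMMEBMMMESBMEBE"  # tag string quoted in the text
words = decode_bmes(sentence, tags)
print("/".join(words))
```

  • Under this assumption the tag string yields seven words of lengths 4, 1, 4, 5, 1, 3, and 2 characters, which matches the seven-part segmentation result quoted in the text.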
  • Use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain the second sentence detection result, where the second sentence detection result is the target dependency relationship between the words in the second sentence.
  • The server uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in the new round and obtains the second sentence detection result, which is the target dependency relationship between the words in the second sentence.
  • The second sentence here is the sentence that follows the first sentence, and the first sentence serves as the basis for filling in the missing part of the second sentence.
  • The grammatical detection method for the second sentence is the same as that for the first sentence, and its output is used as the second sentence detection result.
  • The server determines whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
  • Part of the sentence may be omitted in the second sentence input by the user in the new round, such as the predicate and object.
  • The server therefore needs to judge the second sentence input in the new round: if its semantic expression is incomplete, it is completed according to the context of the multiple rounds of dialogue. To this end, the server first determines whether the second sentence detection result contains only a single entity and whether the second sentence is an interrogative sentence, which indicates whether the second sentence satisfies the condition of semantic incompleteness.
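  • The incompleteness condition above (only a single entity, and an interrogative sentence) might be checked as in the following minimal sketch. The tag sets, the `w` punctuation tag, and the function name `needs_completion` are assumptions made for illustration; the text does not specify how the check is implemented.

```python
# Assumed tag sets; the text only indicates that entity tags such as nhd/nbx exist.
ENTITY_TAGS = {"nhd", "nbx"}   # hypothetical: disease words, insurance product names
IGNORED_TAGS = {"y", "w"}      # hypothetical: modal particles, punctuation


def needs_completion(tokens: list[tuple[str, str]], raw_text: str) -> bool:
    """Return True if the sentence is a question made up of one entity only."""
    # Ignore particles and punctuation when counting the content of the sentence.
    content = [(word, tag) for word, tag in tokens if tag not in IGNORED_TAGS]
    only_single_entity = len(content) == 1 and content[0][1] in ENTITY_TAGS
    # Simplification: a sentence counts as interrogative if it ends in a question mark.
    is_question = raw_text.rstrip().endswith(("?", "？"))
    return only_single_entity and is_question


second = [("cancer", "nhd"), ("?", "w")]
first = [("cold", "nhd"), ("can", "c"), ("apply for insurance", "vn"),
         ("e-life insurance", "nbx"), ("?", "w")]
print(needs_completion(second, "Cancer?"))
print(needs_completion(first, "Can a cold be insured for e-life insurance?"))
```

  • Here the short follow-up "Cancer?" qualifies for completion, while the full first question does not, because it contains more than one content word.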
  • When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • The server fills in the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • If the second sentence is semantically incomplete, the server cannot accurately identify its meaning and cannot feed back the corresponding preset response. Therefore, the server fills in the second sentence according to the acquired context to obtain its complete semantics, and then feeds back the corresponding answer according to those semantics.
  • The server obtains the first sentence detection result and the second sentence detection result, both of which include part-of-speech analysis of the sentence, and extracts the single entity in the second sentence as the first target word. A single entity is a single word or a single phrase in a sentence; for example, in the sentence "What about a cold?", "a cold" is the single entity.
  • In the first sentence detection result, the server selects the single entity with the same grammatical structure as the first target word as the second target word. Having the same grammatical structure here means that the first target word and the second target word have the same part of speech; for example, if the first target word is "cold" and the second target word is "cancer", both are nouns, so the two have the same grammatical structure.
  • After selecting the second target word, the server replaces the second target word in the first sentence with the first target word to obtain the first supplementary sentence.
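  • The replacement step above can be sketched as follows, representing each sentence as a list of (word, tag) pairs. The function name `complete_with_context`, the tags, and the choice of the first same-tag entity as the second target word are assumptions made for illustration, not the application's stated implementation.

```python
from typing import Optional

Token = tuple[str, str]  # (word, part-of-speech tag)


def complete_with_context(first: list[Token], second: list[Token],
                          entity_tags: set[str]) -> Optional[list[Token]]:
    """Build the first supplementary sentence from the first sentence."""
    # First target word: the single entity found in the second sentence.
    first_target = next((tok for tok in second if tok[1] in entity_tags), None)
    if first_target is None:
        return None
    completed, replaced = [], False
    for word, tag in first:
        # Second target word: the first word in the first sentence with the same tag.
        if not replaced and tag == first_target[1]:
            completed.append(first_target)  # substitute the first target word
            replaced = True
        else:
            completed.append((word, tag))
    return completed if replaced else None


first = [("cold", "nhd"), ("can", "c"), ("apply for insurance", "vn"),
         ("e-life insurance", "nbx")]
second = [("cancer", "nhd")]
result = complete_with_context(first, second, {"nhd", "nbx"})
print([word for word, _ in result])
```

  • The sketch returns `None` when no word of the same part of speech exists in the first sentence, leaving the caller to decide how to handle that case.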
  • The server determines whether the first supplementary sentence includes unclear words, which include pronouns, quantifiers, and articles.
  • After the server completes the second sentence input in the new round and obtains the first supplementary sentence, it must also determine whether the first supplementary sentence includes unclear words, because such words leave the semantics of the second sentence unclear. Unclear words include pronouns, quantifiers, and articles.
  • For example, the user's first sentence is "Are both Ping An Fu and Fu Tianxin medical insurance?" and the server replies "Yes." The second sentence entered by the user in a new round is "Can the first type be reimbursed?", which contains the unclear term "the first type".
  • If the server does not use the context of the multiple rounds of dialogue to supplement the second sentence entered by the user in the new round, semantic understanding will deviate, the server will not be able to give the correct answer, and the user will not be able to proceed to the next step.
  • If the first supplementary sentence includes an unclear word, replace the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
  • The server replaces the unclear word in the first supplementary sentence according to the first sentence detection result to obtain the second supplementary sentence.
  • When the server determines that the first supplementary sentence includes an unclear word, the first supplementary sentence is semantically ambiguous, and the server needs to supplement it again according to the context of the multiple rounds of dialogue.
  • The server extracts the unclear word as the third target word and selects the word in the first sentence with the same part of speech as the third target word as the fourth target word. The server then replaces the third target word with the fourth target word to obtain the second supplementary sentence.
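  • A minimal sketch of this second replacement follows. The list `UNCLEAR_WORDS`, the tags, and the rule of taking the first same-tag word from the first sentence are all assumptions; in particular, tagging "the first type" as `nbx` is a simplification so that it shares a tag with the product names.

```python
# Hypothetical list of unclear referring expressions.
UNCLEAR_WORDS = {"it", "this", "that", "the first type"}


def resolve_unclear(first: list[tuple[str, str]],
                    supplementary: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Replace unclear words with a same-tag word taken from the first sentence."""
    resolved = []
    for word, tag in supplementary:
        if word.lower() in UNCLEAR_WORDS:
            # Fourth target word: first word in the first sentence with the same tag.
            candidate = next((w for w, t in first if t == tag), None)
            resolved.append((candidate if candidate is not None else word, tag))
        else:
            resolved.append((word, tag))
    return resolved


first = [("Ping An Fu", "nbx"), ("and", "c"), ("Fu Tianxin", "nbx"),
         ("medical insurance", "n")]
supplementary = [("can", "c"), ("the first type", "nbx"), ("be reimbursed", "v")]
second_supplementary = resolve_unclear(first, supplementary)
print([word for word, _ in second_supplementary])
```

  • If no word with a matching tag is found, the sketch keeps the unclear word unchanged rather than guessing a referent.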
  • In the embodiments of the present application, the preset corpus sentence segmentation function and the preset analysis function are used to perform grammatical detection on the multiple rounds of dialogue input by the user, and the semantics of the dialogue context are used to complete sentences with incomplete semantics, which improves the accuracy of the semantic analysis result and also the accuracy of searching for the corresponding response information based on that result.
  • Referring to FIG. 2, another embodiment of the method for semantic completion in multiple rounds of dialogue in the embodiments of the present application includes:
  • The server obtains the first sentence entered by the user.
  • The server obtains the sentence input by the user and uses it as the first sentence, in preparation for completing the second sentence that follows.
  • The server uses the preset corpus sentence segmentation function to segment the first sentence to obtain the first input sentence. Specifically, the server segments the first sentence to obtain a segmented sentence; the server matches the segmented corpus in the segmented sentence against a preset corpus, which is a corpus established in a preset intent rule base based on business data; if the segmented corpus matches the preset corpus, the server splits the segmented sentence before and after the segmented corpus and uses the resulting sentence as the first input sentence; if the segmented corpus does not match the preset corpus, the server directly uses the segmented corpus as the first input sentence.
  • The server segments the first sentence with the preset corpus sentence segmentation function and checks the grammar of the resulting first input sentence with the preset analysis function; both steps prepare for completing the semantics of the second sentence.
  • The server first uses the preset corpus sentence segmentation function to preliminarily segment the first sentence. Sentence segmentation is the basis of natural language processing: its accuracy directly determines the quality of the part-of-speech tagging, syntactic analysis, word vectors, and text analysis that the server performs afterwards. A simple first sentence can be segmented directly at commas.
  • After segmentation, the server marks the words in the first sentence according to their position in the sentence, for example in the sentence "I love Shenzhen because it is beautiful".
  • After segmenting the first sentence, the server obtains the separated segmented sentence and then matches the segmented corpus in the segmented sentence against the preset corpus.
  • The segmented sentence here is a sentence that cannot be segmented further, so the server goes on to match the segmented corpus within it to determine whether the corresponding corpus needs to be split.
  • The preset corpus here is the corpus built in the preset intent rule base based on business data.
  • For example, the preset intent rule base may be an insurance library, and the preset corpus may include an insurance policy, the ID number of the insured, and so on; the corpus is built from the server's business data.
  • If the segmented corpus matches the preset corpus, the server splits the segmented sentence before and after the segmented corpus and uses the resulting sentence as the first input sentence; if the segmented corpus does not match the preset corpus, the server directly uses the segmented corpus as the first input sentence. For example, after corpus matching, the sentence "Can a cold be insured for e-life insurance?" is divided into "cold/can/insured/e-life insurance/?", where "cold", "insured", and "e-life insurance" are all preset corpus entries in the preset intent rule base. The server does not need to split a segmented corpus that matches the preset corpus; it splits directly before and after the matched corpus, which saves sentence segmentation time.
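  • The matching step above can be sketched as a longest-match scan that keeps preset corpus entries whole and splits the sentence before and after each match. The function name `split_around_lexicon` is hypothetical, and the Chinese sentence and lexicon entries are assumed reconstructions of the example rather than text from the application.

```python
def split_around_lexicon(sentence: str, lexicon: set[str]) -> list[str]:
    """Split a sentence before and after every preset corpus entry it contains."""
    entries = sorted(lexicon, key=len, reverse=True)  # prefer the longest match
    pieces, buffer, i = [], "", 0
    while i < len(sentence):
        match = next((e for e in entries if e and sentence.startswith(e, i)), None)
        if match:
            if buffer:                 # flush unmatched text before the entry
                pieces.append(buffer)
                buffer = ""
            pieces.append(match)       # matched entries are never split further
            i += len(match)
        else:
            buffer += sentence[i]      # collect text that is not in the lexicon
            i += 1
    if buffer:
        pieces.append(buffer)
    return pieces


# Assumed reconstruction of the example; the entries stand in for the preset corpus.
preset_corpus = {"感冒", "投保", "e生保"}
pieces = split_around_lexicon("感冒可以投保e生保吗?", preset_corpus)
print("/".join(pieces))
```

  • In this sketch the text between matches stays in one piece; a full system would pass those remaining pieces to an ordinary word segmenter.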
  • Use the preset analysis function to perform grammatical detection on the first input sentence to obtain the first sentence detection result, where the first sentence detection result is the target dependency relationship between the words in the first input sentence.
  • The server uses the preset analysis function to perform grammatical detection on the first input sentence and obtains the first sentence detection result, which is the target dependency relationship between the words in the first input sentence.
  • Specifically, the server performs part-of-speech tagging and entity extraction on the words in the first input sentence to obtain the first sentence tagging result; the server calculates the dependency probability between the words in the first sentence tagging result, where the dependency probability is the frequency of occurrence of a preset dependency relationship; the server determines the target dependency relationship between the words, which is the preset dependency relationship with the largest dependency probability; and the server obtains the first sentence detection result, which is the target dependency relationship between the words in the first input sentence.
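  • The selection of the target dependency relationship can be illustrated by treating the dependency probability as a relative frequency and taking the most frequent relation. The relation labels `VOB` and `ATT`, the sample observations, and the function name `target_dependency` are hypothetical; the text does not say where the frequencies come from.

```python
from collections import Counter


def target_dependency(observed_relations: list[str]) -> tuple[str, float]:
    """Return the most frequent relation and its relative frequency."""
    if not observed_relations:
        raise ValueError("no observations for this word pair")
    counts = Counter(observed_relations)
    relation, count = counts.most_common(1)[0]   # relation with the largest count
    return relation, count / len(observed_relations)


# Hypothetical observations of the relation between two words in reference data.
observations = ["VOB", "VOB", "ATT", "VOB"]
relation, probability = target_dependency(observations)
print(relation, probability)
```

  • Ties are resolved in favour of the relation seen first, which is a property of `Counter.most_common` and not something the text prescribes.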
  • The preset analysis function here refers to the dependency analysis function.
  • The server uses the dependency analysis function to check the grammar of the first sentence. The dependency analysis function expresses grammar through the dependency relationships between the words in a sentence: when one word modifies another, the first word is said to depend on the second, which makes the relationships between the words in the sentence explicit.
  • In this way the server can accurately determine the part of speech of each word and the relationships between the words, which fully prepares for completing the sentence. Before performing dependency analysis, the server needs to tag the sentence with parts of speech. The server first tags each character in the first sentence according to its position within a word.
  • The BMES tagging method can be used, where B marks the first character of a multi-character word, M marks a character in the middle of a word (a word may have several middle characters), E marks the last character of a word, and S marks a word consisting of a single character. For example, tagging "Online Merchant Bank is the most important product of Ant Financial's microfinance division" gives the result "BMMESBMMEBMMMESBMEBE", and the corresponding word segmentation result is "Online Merchant Bank/is/Ant Financial/microfinance division/of/most important/product". The server then looks up the part of speech of each segmented word in the preset vocabulary and tags the word segmentation result with parts of speech.
  • the sentence includes only the subject, predicate or object
  • The server uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in the new round and obtains the second sentence detection result, which is the target dependency relationship between the words in the second sentence.
  • The second sentence here is the sentence that follows the first sentence, and the first sentence serves as the basis for filling in the missing part of the second sentence.
  • The grammatical detection method for the second sentence is the same as that for the first sentence, and its output is used as the second sentence detection result.
  • The server determines whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
  • Part of the sentence may be omitted in the second sentence input by the user in the new round, such as the predicate and object.
  • The server therefore needs to judge the second sentence input in the new round: if its semantic expression is incomplete, it is completed according to the context of the multiple rounds of dialogue. To this end, the server first determines whether the second sentence detection result contains only a single entity and whether the second sentence is an interrogative sentence, which indicates whether the second sentence satisfies the condition of semantic incompleteness.
  • When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • The server completes the semantics of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • Specifically, the server extracts the single entity in the second sentence detection result and uses it as the first target word; in the first sentence detection result, the server selects the single entity with the same grammatical structure as the first target word to obtain the second target word; in the first sentence, the server replaces the second target word with the first target word to obtain the first supplementary sentence.
  • If the second sentence is semantically incomplete, the server cannot accurately identify its meaning and cannot feed back the corresponding preset response. Therefore, the server fills in the semantically missing part of the second sentence according to the acquired context to obtain its complete semantics, and then feeds back the corresponding answer according to those semantics.
  • The server obtains the first sentence detection result and the second sentence detection result, both of which include part-of-speech analysis of the sentence. It extracts the single entity in the second sentence as the first target word and, in the first sentence detection result, selects the single entity with the same grammatical structure as the first target word as the second target word. In the first sentence, the server replaces the second target word with the first target word to obtain the first supplementary sentence.
  • the server performs part-of-speech tagging and dependency analysis on the first sentence input by the user, and obtains "cold/nhd can/c apply for insurance/vn e-life insurance/nbx what/y?"
  • nhd and nbx are parts of speech specific to the preset corpus. For example, if the preset intent rule base in which the preset corpus is located is an insurance library, nhd denotes disease words in that library and nbx denotes insurance-name words in that library; c stands for auxiliary verbs, vn for gerunds, and y for modal auxiliary words.
  • the extracted single entities are: cold/nhd, apply for insurance/vn, e-life insurance/nbx.
  • in a new round, the user enters the second sentence "Cancer?"
  • the server parses this sentence and gets "Cancer/nhd which/y". It is obvious that the second sentence only includes a single entity and the second sentence is a rhetorical question.
  • the server then supplements the second sentence. From the first sentence detection result and the second sentence detection result, it determines that cold/nhd and cancer/nhd are words of the same type. Replacing "cold" with "cancer" in the first sentence yields "Can cancer be insured for e-life insurance?", which completes the second sentence.
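The single-entity completion illustrated above can be sketched in Python as follows. This is a minimal sketch rather than the application's implementation: it assumes each detection result is available as a list of (word, tag) pairs, approximates "same grammatical structure" by "same part-of-speech tag", leaves the question-sentence check to the caller, and uses English stand-in tokens. The names `ENTITY_TAGS`, `single_entity`, and `complete_with_context` are hypothetical.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Assumption: only the two corpus-specific tags named in the text count as entities.
ENTITY_TAGS = {"nhd", "nbx"}


def single_entity(tokens: List[Token]) -> Optional[Token]:
    """Return the only entity token, or None if there is not exactly one."""
    entities = [t for t in tokens if t[1] in ENTITY_TAGS]
    return entities[0] if len(entities) == 1 else None


def complete_with_context(first: List[Token], second: List[Token]) -> Optional[List[Token]]:
    """Copy the first sentence, swapping in the second sentence's single entity."""
    first_target = single_entity(second)
    if first_target is None:
        return None
    completed: List[Token] = []
    replaced = False
    for word, tag in first:
        if not replaced and tag == first_target[1]:
            # The second target word (from the first sentence) is replaced
            # by the first target word (from the second sentence).
            completed.append(first_target)
            replaced = True
        else:
            completed.append((word, tag))
    return completed if replaced else None


# Illustrative stand-in tokens for the two rounds of the example.
first = [("cold", "nhd"), ("can", "c"), ("insure", "vn"),
         ("e-life insurance", "nbx"), ("?", "y")]
second = [("cancer", "nhd"), ("?", "y")]
first_supplementary = complete_with_context(first, second)
```

Only the first token with a matching tag is replaced, so "e-life insurance" (tag nbx) is left alone when the new entity is a disease word (tag nhd).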
  • the server determines whether the first supplementary sentence includes unclear words, which include pronouns, quantifiers, and articles. Specifically, the server obtains the detection result of the first supplementary sentence, which is a combination of the second sentence detection result and the first sentence detection result, and determines whether that detection result includes pronouns, quantifiers, or articles.
  • after the server completes the second sentence entered in the new round and obtains the first supplementary sentence, it must also determine whether the first supplementary sentence includes unclear words, because such words leave the semantics of the second sentence unclear.
  • unclear words include pronouns, quantifiers and articles.
  • the first sentence of the user is "Are both Ping An Fu and Fu Dan for medical insurance?”
  • the server replies to the first sentence as "Yes.”
  • the second sentence entered by the user in a new round is "What can the first type be reimbursed for?" This second sentence contains the unclear term "the first type".
  • if the server does not refer to the context of the multiple rounds of dialogue to supplement the second sentence entered by the user in the new round, semantic understanding will deviate, the server will not be able to give the correct answer, and the user will not be able to proceed to the next step.
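Checking a tagged sentence for unclear words can be sketched as a simple tag lookup. The sketch assumes the detection result is a list of (word, tag) pairs. The tag r for pronouns follows the tagging example in this document, while `q` (quantifier) and `art` (article) are hypothetical tag names chosen only for illustration, as are the names `UNCLEAR_TAGS` and `find_unclear_words`.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# r = pronoun (as in the document's example); q and art are assumed tag names.
UNCLEAR_TAGS = {"r", "q", "art"}


def find_unclear_words(tokens: List[Token]) -> List[str]:
    """Return every word whose tag marks it as a pronoun, quantifier, or article."""
    return [word for word, tag in tokens if tag in UNCLEAR_TAGS]


# Illustrative stand-in tokens for a sentence that still needs resolving.
tagged = [("the former", "r"), ("can", "v"), ("reimburse", "v"), ("what", "y")]
unclear = find_unclear_words(tagged)
needs_second_completion = bool(unclear)
```

An empty result means the sentence can be answered as it stands; a non-empty one means the context of the earlier rounds is needed.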
  • if the first supplementary sentence includes an unclear word, the server replaces the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence. Specifically, if the detection result of the first supplementary sentence includes an unclear word, the server extracts the unclear word and uses it as the third target word; in the first sentence detection result, the server selects the word with the same grammatical structure as the third target word to obtain the fourth target word; in the second sentence, the server replaces the third target word with the fourth target word to obtain the second supplementary sentence.
  • when the server determines that the first supplementary sentence includes an unclear word, the first supplementary sentence is semantically ambiguous, and the server needs to supplement it again according to the context of the multiple rounds of dialogue.
  • the server extracts the unclear word from the second sentence as the third target word, selects the word in the first sentence with the same part of speech as the third target word as the fourth target word, and replaces the third target word with the fourth target word in the second sentence.
  • the server performs part-of-speech tagging and dependency analysis on the first sentence input by the user, and obtains "Ping An Fu/nbx, Fu Yuan/nbx, is it/v Medical Insurance/nbx/y?"
  • nbx is a part of speech specific to the preset corpus. For example, if the preset intent rule base in which the preset corpus is located is an insurance library, nbx denotes insurance names in that library.
  • the second sentence entered by the user in the new round is "What can the former be reimbursed?"
  • the server performs part-of-speech tagging and dependency analysis on the second sentence entered by the user in the new round, and obtains "the former/r can/v reimbursement/v what/y?", where r stands for pronouns, v stands for verbs, and y stands for modal auxiliary words.
  • the second sentence entered in the new round is then supplemented. Combined with the detection result of the user's first sentence, "the former" here refers to "Ping An Fu", so "Ping An Fu" is substituted directly into the second sentence to give "What can Ping An Fu be reimbursed for?" This is the complete second supplementary sentence, on the basis of which the server can give feedback and answers.
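The replacement of "the former" by "Ping An Fu" can be sketched as below. The text does not spell out how the pronoun is matched to a particular entity, so this sketch assumes a simple positional rule ("the former" is the first entity of the previous sentence, "the latter" the second). That rule, the (word, tag) representation, and the names `POSITION_OF` and `replace_unclear_words` are all assumptions made for illustration.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

ENTITY_TAGS = {"nbx", "nhd"}
# Hypothetical positional rule: which entity of the previous sentence a pronoun points to.
POSITION_OF = {"the former": 0, "the latter": 1}


def replace_unclear_words(first: List[Token], second: List[Token]) -> List[Token]:
    """Substitute pronouns in `second` with entities taken from `first`."""
    entities = [t for t in first if t[1] in ENTITY_TAGS]
    resolved: List[Token] = []
    for word, tag in second:
        index = POSITION_OF.get(word.lower()) if tag == "r" else None
        if index is not None and index < len(entities):
            # The third target word (pronoun) is replaced by the fourth target word (entity).
            resolved.append(entities[index])
        else:
            # Everything else, including pronouns the rule cannot resolve, is kept.
            resolved.append((word, tag))
    return resolved


# Illustrative stand-in tokens for the two rounds of the example.
first = [("Ping An Fu", "nbx"), ("Fu Yuan", "nbx"), ("is it", "v"), ("medical insurance", "nbx")]
second = [("the former", "r"), ("can", "v"), ("reimburse", "v"), ("what", "y")]
second_supplementary = replace_unclear_words(first, second)
```

A pronoun that the rule cannot resolve is left in place, so a caller can still detect that the sentence remains ambiguous.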
  • by using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the multiple rounds of dialogue input by the user, and by using the context of the multiple rounds of dialogue to complete semantically incomplete sentences, the accuracy of the semantic analysis result is improved, as is the accuracy of searching for the corresponding response information based on the semantic analysis result.
  • an embodiment of the device for semantic completion in multiple rounds of dialogue includes:
  • the first acquiring unit 301 is configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain the first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence;
  • the second acquiring unit 302 is configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain the second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence;
  • the first judgment unit 303 is configured to judge whether the second sentence detection result includes only a single entity and the second sentence is a question sentence;
  • the first complementing unit 304 is configured to, when the second sentence detection result includes only a single entity and the second sentence is a question sentence, complete the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence;
  • the second judging unit 305 is configured to judge whether the first supplementary sentence includes unclear words, which include pronouns, quantifiers, and articles;
  • the second supplementary unit 306 is configured to, if the first supplementary sentence includes an unclear word, replace the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
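How the six units listed above hand results to one another can be sketched as a small class whose fields stand for the units. This is only a wiring sketch under stated assumptions: each unit is modeled as a plain callable, detection results are lists of (word, tag) pairs, and the unclear-word check is applied to whatever sentence results from the first stage. The class name `CompletionDevice` and the toy helper functions are hypothetical and are not the application's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)
Tokens = List[Token]


@dataclass
class CompletionDevice:
    detect: Callable[[str], Tokens]                      # acquiring units 301 and 302
    is_single_entity_question: Callable[[Tokens], bool]  # first judgment unit 303
    complete: Callable[[Tokens, Tokens], Tokens]         # first complementing unit 304
    has_unclear_words: Callable[[Tokens], bool]          # second judging unit 305
    replace_unclear: Callable[[Tokens, Tokens], Tokens]  # second supplementary unit 306

    def run(self, first_sentence: str, second_sentence: str) -> Tokens:
        first = self.detect(first_sentence)
        second = self.detect(second_sentence)
        result = second
        if self.is_single_entity_question(second):
            result = self.complete(first, second)
        if self.has_unclear_words(result):
            result = self.replace_unclear(first, result)
        return result


# Toy stand-ins for the real units, for demonstration only.
def toy_detect(sentence: str) -> Tokens:
    words = sentence.replace("?", " ?").split()
    tags = {"cold": "nhd", "cancer": "nhd", "?": "y"}
    return [(w, tags.get(w, "x")) for w in words]


def toy_is_single_entity_question(tokens: Tokens) -> bool:
    content = [t for t in tokens if t[1] != "y"]
    return len(content) == 1 and content[0][1] == "nhd" and tokens[-1][1] == "y"


def toy_complete(first: Tokens, second: Tokens) -> Tokens:
    entity = second[0]
    return [entity if tag == entity[1] else (word, tag) for word, tag in first]


device = CompletionDevice(
    detect=toy_detect,
    is_single_entity_question=toy_is_single_entity_question,
    complete=toy_complete,
    has_unclear_words=lambda tokens: any(tag == "r" for _, tag in tokens),
    replace_unclear=lambda first, tokens: tokens,
)
completed = device.run("can cold insure ?", "cancer ?")
```

Passing the units in as callables keeps the control flow separate from any particular tagger or rule base, so each unit can be swapped or tested on its own.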
  • the first acquiring unit 301 uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain the first sentence detection result, which is the target dependency relationship between each word in the first sentence;
  • the second acquiring unit 302 uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in the new round to obtain the second sentence detection result, which is the target dependency relationship between each word in the second sentence;
  • the first judgment unit 303 judges whether the second sentence detection result includes only a single entity and the second sentence is a question sentence;
  • the first complementing unit 304, when the second sentence detection result includes only a single entity and the second sentence is a question sentence, completes the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence;
  • the second judging unit 305 determines whether the first supplementary sentence includes unclear words.
  • by using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the multiple rounds of dialogue input by the user, and by using the context of the multiple rounds of dialogue to complete semantically incomplete sentences, the accuracy of the semantic analysis result is improved, as is the accuracy of searching for the corresponding response information based on the semantic analysis result.
  • another embodiment of the device for semantic complement in multiple rounds of dialogue in the embodiment of the present application includes:
  • the first acquiring unit 301 is configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user to obtain the first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence;
  • the second acquiring unit 302 is configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round to obtain the second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence;
  • the first judgment unit 303 is configured to judge whether the second sentence detection result includes only a single entity and the second sentence is a question sentence;
  • the first complementing unit 304 is configured to, when the second sentence detection result includes only a single entity and the second sentence is a question sentence, complete the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence;
  • the second judging unit 305 is configured to judge whether the first supplementary sentence includes unclear words, which include pronouns, quantifiers, and articles;
  • the second supplementary unit 306 is configured to, if the first supplementary sentence includes an unclear word, replace the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
  • the first obtaining unit 301 includes:
  • the obtaining module 3011 is used to obtain the first sentence input by the user;
  • the segmentation module 3012 is used to segment the first sentence by using a preset corpus sentence segmentation function to obtain the first input sentence;
  • the detection module 3013 is configured to perform grammatical detection on the first input sentence by using a preset analysis function to obtain a first sentence detection result.
  • the first sentence detection result is the target dependency relationship between each word in the first input sentence.
  • the segmentation module 3012 is specifically used for:
  • the preset corpus is the corpus built in the preset intent rule base based on business data;
  • if the segmented corpus matches the preset corpus, split the sentence before and after the matched corpus to obtain the segmented sentence, and use the segmented sentence as the first input sentence;
  • if the segmented corpus does not match the preset corpus, use the segmented corpus directly as the first input sentence.
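The matching rule above can be sketched as splitting a sentence around any term found in the preset corpus. This is a minimal sketch under assumptions: the preset corpus is modeled as a plain set of strings, matching is exact substring matching, and the function name `segment_by_corpus` is hypothetical. The preset corpus sentence segmentation function itself is not specified in this section.

```python
import re
from typing import Iterable, List


def segment_by_corpus(sentence: str, preset_corpus: Iterable[str]) -> List[str]:
    """Split `sentence` before and after each preset-corpus term it contains."""
    # Longer terms first, so that a long term wins over a shorter one it contains.
    terms = sorted((t for t in preset_corpus if t), key=len, reverse=True)
    if not terms:
        return [sentence]
    pattern = "(" + "|".join(re.escape(t) for t in terms) + ")"
    # The capturing group makes re.split keep the matched terms as their own items.
    parts = re.split(pattern, sentence)
    return [p.strip() for p in parts if p.strip()]


# Illustrative corpus terms and sentence.
corpus = {"cold", "e-life insurance"}
segments = segment_by_corpus("Can a cold be insured under e-life insurance?", corpus)
unmatched = segment_by_corpus("Hello there", corpus)
```

Plain substring matching would also fire inside longer English words (for example "cold" inside "scold"), so a real rule base would need a stricter notion of a match.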
  • the detection module 3013 is specifically used for:
  • the word is the word in the first sentence labeling result, and the dependence probability is the frequency of the preset dependence relationship;
  • the first sentence detection result is obtained, and the first sentence detection result is the target dependency relationship between each word in the first input sentence.
  • the first complementing unit 304 is specifically used for:
  • when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, extract the single entity in the second sentence detection result, and use the single entity as the first target word;
  • the second judgment unit 305 is specifically configured to:
  • the detection result of the first supplementary sentence being a combination of the second sentence detection result and the first sentence detection result;
  • the second complementing unit 306 is specifically configured to:
  • if the detection result of the first supplementary sentence includes an unclear word, extract the unclear word and use it as the third target word;
  • the first obtaining unit 301 includes: an obtaining module 3011, configured to obtain the first sentence input by the user; a segmentation module 3012, configured to segment the first sentence by using the preset corpus sentence segmentation function to obtain the first input sentence; and a detection module 3013, configured to perform grammatical detection on the first input sentence by using the preset analysis function to obtain the first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first input sentence;
  • the second acquisition unit 302 uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in the new round, to obtain the second sentence detection result, and the second sentence detection result is the second The target dependency relationship between each word in the sentence;
  • the first judgment unit 303 judges whether the second sentence detection result includes only a single entity and the second sentence is a question sentence; the first complementing unit 304, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completes the semantically missing part of the second sentence according to the first sentence detection result to obtain the first supplementary sentence.
  • by using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the multiple rounds of dialogue input by the user, and by using the context of the multiple rounds of dialogue to complete semantically incomplete sentences, the accuracy of the semantic analysis result is improved, as is the accuracy of searching for the corresponding response information based on the semantic analysis result.
  • FIG. 5 is a schematic structural diagram of a device for semantic completion in a multi-round dialogue provided by an embodiment of the present application.
  • the device 500 for semantic completion in a multi-round dialogue may vary considerably with configuration or performance, and may include one or more central processing units (CPU) 501 (for example, one or more processors), a memory 509, and one or more storage media 508 (for example, one or more storage devices) storing application programs 507 or data 506.
  • the memory 509 and the storage medium 508 may be short-term storage or persistent storage.
  • the program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in a device that complements semantics in multiple rounds of dialogue. Further, the processor 501 may be configured to communicate with the storage medium 508, and execute a series of instruction operations in the storage medium 508 on the device 500 with semantic completion in multiple rounds of dialogue.
  • the device 500 for semantic completion in multiple rounds of dialogue may also include one or more power sources 502, one or more wired or wireless network interfaces 503, one or more input and output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • the structure shown in FIG. 5 does not constitute a limitation on the device for semantic completion in multiple rounds of dialogue, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.
  • the processor 501 is the control center of the device that completes semantics in multiple rounds of dialogue, and can perform processing in accordance with the method of semantic complement in multiple rounds of dialogue.
  • the processor 501 uses various interfaces and lines to connect the various parts of the entire device for semantic completion in multiple rounds of dialogue. By running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509, it uses the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on the multiple rounds of dialogue input by the user, and uses the context of the multiple rounds of dialogue to complete semantically incomplete sentences, which improves the accuracy of the semantic analysis result.
  • the storage medium 508 and the memory 509 are both carriers for storing data.
  • the storage medium 508 may be an internal memory with a small storage capacity but a fast speed, and the memory 509 may be an external memory with a large storage capacity but a slow storage speed.
  • the memory 509 may be used to store software programs and modules.
  • the processor 501 executes various functional applications and data processing of the device 500 with semantic complement in multiple rounds of dialogues by running the software programs and modules stored in the memory 509.
  • the memory 509 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store the created data, etc.
  • the memory 509 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the program for semantic completion and the received data stream are stored in the memory 509, and the processor 501 calls them from the memory 509 when they are needed.
  • the computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • Computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • Computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or twisted pair) or wireless means (such as infrared, wireless, or microwave).
  • the computer-readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the computer-readable storage medium may be non-volatile or volatile.

Abstract

A method, device, and equipment for semantic completion in a multi-round dialogue, and a storage medium. A grammar check is performed on the multi-round dialogue via a preset corpus segmentation function and a preset analysis function, and semantically incomplete sentences are completed, thus increasing the accuracy of a semantic analysis result and the accuracy of searching for corresponding response information on the basis of the semantic analysis result. The method comprises: utilizing a preset corpus segmentation function and a preset analysis function to respectively perform grammar checks on a first sentence and a second sentence to produce a first sentence check result and a second sentence check result; when the second sentence check result comprises only a single entity and the second sentence is a question, completing the semantically missing part of the second sentence on the basis of the first sentence check result to produce a first completed sentence (104); and if the first completed sentence comprises an unclear word, replacing the unclear word in the first completed sentence on the basis of the first sentence check result to produce a second completed sentence (106).

Description

Method, device, equipment and storage medium for semantic completion in multiple rounds of dialogue
This application claims priority to Chinese patent application No. 202010088078.9, filed with the Chinese Patent Office on February 12, 2020 and entitled "Method, device, equipment and storage medium for semantic completion in multiple rounds of dialogue", the entire content of which is incorporated into this application by reference.
Technical field
This application relates to the field of artificial intelligence, and in particular to a method, device, equipment and storage medium for semantic completion in multiple rounds of dialogue.
Background
Human-computer interaction (HCI) is the study of the interaction between systems and users. The system can be any of a variety of machines, or a computerized system or software. Typically, the user communicates with the system through a visible human-computer interaction interface and performs operations to exchange information with the system in order to complete a given task. Human-computer interaction is an important part of the field of artificial intelligence, especially in customer service and information consultation, where it allows users to obtain the information they need.
In customer service, intelligent dialogue robots are currently used to communicate with users, and the data they feed back is used to meet users' need to obtain information.
However, the inventor realized that the requests an intelligent dialogue robot receives from users may be semantically unclear, for example when the request entered by the user lacks a subject or an object. In such cases the intelligent dialogue robot cannot accurately recognize the user's intention and cannot accurately feed back response information based on that intention, so the accuracy of its feedback is low.
Summary of the invention
This application provides a method, device, equipment and storage medium for semantic completion in multiple rounds of dialogue, which are used to solve the problem of semantically incomplete sentences in multiple rounds of dialogue, improving the accuracy of the semantic analysis result and also the accuracy of searching for the corresponding response information based on the semantic analysis result.
A first aspect of the embodiments of this application provides a method for semantic completion in multiple rounds of dialogue, including: using a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on an acquired first sentence input by the user to obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence; judging whether the second sentence detection result includes only a single entity and the second sentence is a question sentence; when the second sentence detection result includes only a single entity and the second sentence is a question sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; judging whether the first supplementary sentence includes unclear words, the unclear words including pronouns, quantifiers, and articles; and if the first supplementary sentence includes an unclear word, replacing the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
A second aspect of the embodiments of this application provides a device for semantic completion in multiple rounds of dialogue, including: a first acquiring unit, configured to use a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on an acquired first sentence input by the user to obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence; a second acquiring unit, configured to use the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence; a first judgment unit, configured to judge whether the second sentence detection result includes only a single entity and the second sentence is a question sentence; a first complementing unit, configured to, when the second sentence detection result includes only a single entity and the second sentence is a question sentence, complete the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; a second judging unit, configured to judge whether the first supplementary sentence includes unclear words, the unclear words including pronouns, quantifiers, and articles; and a second supplementary unit, configured to, if the first supplementary sentence includes an unclear word, replace the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
A third aspect of the embodiments of this application provides equipment for semantic completion in multiple rounds of dialogue, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the method for semantic completion in multiple rounds of dialogue described in any of the above implementations, for example, the following steps: using a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on an acquired first sentence input by the user to obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence; judging whether the second sentence detection result includes only a single entity and the second sentence is a question sentence; when the second sentence detection result includes only a single entity and the second sentence is a question sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; judging whether the first supplementary sentence includes unclear words, the unclear words including pronouns, quantifiers, and articles; and if the first supplementary sentence includes an unclear word, replacing the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
A fourth aspect of the embodiments of this application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the method described in the first aspect above, for example, the following steps: using a preset corpus sentence segmentation function and a preset analysis function to perform grammatical detection on an acquired first sentence input by the user to obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between each word in the first sentence; using the preset corpus sentence segmentation function and the preset analysis function to perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between each word in the second sentence; judging whether the second sentence detection result includes only a single entity and the second sentence is a question sentence; when the second sentence detection result includes only a single entity and the second sentence is a question sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence; judging whether the first supplementary sentence includes unclear words, the unclear words including pronouns, quantifiers, and articles; and if the first supplementary sentence includes an unclear word, replacing the unclear word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
The embodiments of the present application improve the accuracy of semantic analysis results, and also improve the accuracy of searching for the corresponding response information based on those results.
Description of the Drawings
FIG. 1 is a schematic diagram of one embodiment of the method for semantic completion in multi-round dialogue of this application;
FIG. 2 is a schematic diagram of another embodiment of the method for semantic completion in multi-round dialogue of this application;
FIG. 3 is a schematic diagram of one embodiment of the device for semantic completion in multi-round dialogue of this application;
FIG. 4 is a schematic diagram of another embodiment of the device for semantic completion in multi-round dialogue of this application;
FIG. 5 is a schematic diagram of one embodiment of the equipment for semantic completion in multi-round dialogue of this application.
Detailed Description
The present application provides a method for semantic completion in multi-round dialogue, which addresses the problem of semantically incomplete sentences in multi-round dialogue, improves the accuracy of semantic analysis results, and also improves the accuracy of searching for the corresponding response information based on those results.
The technical solution of this application can be applied to the field of artificial intelligence or big data technology; for example, it can be implemented on a data platform such as a cloud computing platform.
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.
The terms "first", "second", "third", "fourth", and so on (if any) in the description, the claims, and the above drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than the one illustrated or described. In addition, the terms "including" and "having", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Referring to FIG. 1, one embodiment of the method for semantic completion in multi-round dialogue in the embodiments of this application includes:
101. Using a preset corpus sentence-segmentation function and a preset analysis function, perform grammatical detection on an acquired first sentence input by a user to obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between the words in the first sentence.
The server uses the preset corpus sentence-segmentation function and the preset analysis function to perform grammatical detection on the acquired first sentence input by the user, and obtains the first sentence detection result, which is the target dependency relationship between the words in the first sentence.
It should be noted that the server segments the first sentence with the preset corpus sentence-segmentation function and performs grammatical detection on the first input sentence with the preset analysis function, both in preparation for completing the semantics of the second sentence. The server first uses the preset corpus sentence-segmentation function to segment the first sentence. Sentence segmentation is the foundation of natural language processing, and its accuracy directly determines the quality of the part-of-speech tagging, syntactic analysis, word vectors, and text analysis that the server performs afterwards. A simple first sentence can be segmented directly at commas. The server labels each character of the first sentence according to its position in the sentence. For example, labeling the sentence “我爱深圳因为它很美” ("I love Shenzhen because it is beautiful") gives “我N爱N深N圳Y因N为N它N很N美N”, where Y marks a segmentation point and N marks a non-segmentation point. This can be viewed as the server performing a binary classification on each position between adjacent characters of the first sentence to decide whether the sentence should be segmented there.
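The Y/N labeling in the example above can be reproduced with a small sketch. The description does not say how the server decides where a break belongs, so the break positions are supplied by hand here; the function name `label_breaks` and its parameters are illustrative assumptions, not part of the application.

```python
def label_breaks(sentence, break_after):
    """Append Y (segmentation point) or N (no segmentation) after each character.

    `break_after` is a hypothetical input: the set of character indices
    after which a break was predicted by some classifier not shown here.
    """
    return "".join(
        ch + ("Y" if i in break_after else "N")
        for i, ch in enumerate(sentence)
    )


# The break after index 3 ("圳") mirrors the example in the description.
labeled = label_breaks("我爱深圳因为它很美", {3})
print(labeled)
```

In a real system the set of break positions would come from the binary classifier the paragraph describes, one decision per character position.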
It should be noted that the preset analysis function here is a dependency analysis function, which the server uses to perform grammatical detection on the first sentence. Dependency analysis expresses grammar through the dependency relationships between the words in a sentence: if one word modifies another, the first word is said to depend on the second. Once the relationships between the words in a sentence are made explicit, the server can accurately determine the part of speech of each word and the relationships between words, which prepares for completing the sentence. Before performing dependency analysis, the server needs to tag the sentence with parts of speech. The server labels each character of the first sentence according to its position within a word, for which the BMES tagging scheme can be used: B marks the first character of a word, M marks a character in the middle of a word (a word may have several middle characters), E marks the last character of a word, and S marks a word consisting of a single character. For example, labeling “网商银行是蚂蚁金服微贷事业部的最重要产品” ("MYbank is the most important product of Ant Financial's micro-loan division") gives “BMMESBMMEBMMMESBMEBE”, and the corresponding word segmentation is “网商银行/是/蚂蚁金服/微贷事业部/的/最重要/产品”. The server looks up the part of speech of each segmented word in a preset word library and tags the segmentation result accordingly. When a single entity appears in the sentence (the sentence includes only a subject, a predicate, or an object), the single entity is extracted and saved in the first sentence detection result for later use.
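As an illustration of how a BMES tag string maps back to words, the following sketch decodes the example from the description. It only covers the decoding step; how the tags are predicted is not specified in the application, and the function name `decode_bmes` is an assumption.

```python
def decode_bmes(text, tags):
    """Split `text` into words using one BMES tag per character.

    B = first character of a word, M = middle character,
    E = last character, S = single-character word.
    """
    if len(text) != len(tags):
        raise ValueError("text and tags must have the same length")
    words, current = [], ""
    for ch, tag in zip(text, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that stops mid-word
        words.append(current)
    return words


words = decode_bmes("网商银行是蚂蚁金服微贷事业部的最重要产品", "BMMESBMMEBMMMESBMEBE")
print("/".join(words))
```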
102. Using the preset corpus sentence-segmentation function and the preset analysis function, perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between the words in the second sentence.
The server uses the preset corpus sentence-segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round, and obtains the second sentence detection result, which is the target dependency relationship between the words in the second sentence.
The second sentence here is the sentence that follows the first sentence, and the first sentence serves as the basis for filling in the semantically missing part of the second sentence. Grammatical detection is performed on the second sentence in the same way as on the first sentence, and the result is taken as the second sentence detection result.
103. Determine whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
The server determines whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
It can be understood that part of the second sentence input by the user in a new round may be omitted, for example the predicate and the object. The server therefore needs to examine the second sentence: if its semantic expression is incomplete, the server completes it according to the context of the multi-round dialogue. To decide whether the second sentence meets the condition of semantic incompleteness, the server first determines whether the second sentence analysis result contains only a single entity and whether the second sentence is an interrogative sentence.
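The check in step 103 can be sketched as a predicate over part-of-speech-tagged tokens. The tag names (`nhd`, `nbx`, `vn`, `y`) are borrowed from the insurance example given later in this description; treating exactly those tags as entity and question markers, and the name `is_single_entity_question`, are assumptions made for illustration only.

```python
# Hypothetical tag sets, taken from the insurance example in this description.
ENTITY_TAGS = {"nhd", "nbx", "vn"}   # disease term, insurance name, gerund
QUESTION_TAGS = {"y"}                # modal particle such as 么 / 哪 / 呢
QUESTION_MARKS = {"?", "?"}


def is_single_entity_question(tagged_tokens):
    """Return True when the sentence holds exactly one entity and reads as a question."""
    entities = [w for w, t in tagged_tokens if t in ENTITY_TAGS]
    others = [(w, t) for w, t in tagged_tokens if t not in ENTITY_TAGS]

    def marks_question(word, tag):
        return tag in QUESTION_TAGS or word in QUESTION_MARKS

    is_question = any(marks_question(w, t) for w, t in tagged_tokens)
    # Everything besides the entity must be a question particle or mark.
    only_entity = len(entities) == 1 and all(marks_question(w, t) for w, t in others)
    return only_entity and is_question


print(is_single_entity_question([("癌症", "nhd"), ("哪", "y")]))
```

A fuller first-round sentence, which contains several entities and an auxiliary verb, fails the check and is left as it is.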
104. When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence.
When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, the server fills in the semantically missing part of the second sentence according to the first sentence detection result and obtains the first supplementary sentence.
If the second sentence detection result contains a single entity and the second sentence is an interrogative sentence, the server cannot accurately identify the meaning of the second sentence and cannot return the corresponding preset response. The server therefore completes the second sentence according to the acquired context to obtain its full semantics, and then returns the corresponding answer. The server obtains the first sentence detection result and the second sentence detection result, both of which include a part-of-speech analysis of the sentence, and extracts the single entity in the second sentence as the first target word. A single entity here is a single character or a single phrase in the sentence; for example, in the sentence “感冒呢?” ("What about a cold?"), the word “感冒” ("cold") is the single entity. After extracting the single entity from the second sentence, the server selects from the first sentence a single entity with the same grammatical structure as the first target word as the second target word. The same grammatical structure here means that the two target words have the same part of speech; for example, if one target word is “感冒” ("cold") and the other is “癌症” ("cancer"), both are nouns, so the two can be said to have the same grammatical structure. After selecting the second target word, the server substitutes the first target word for the second target word in the first sentence to obtain the first supplementary sentence.
105. Determine whether the first supplementary sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles.
The server determines whether the first supplementary sentence includes words with unclear reference, which include pronouns, quantifiers, and articles.
It should be noted that after the server has completed the second sentence input in the new round and obtained the first supplementary sentence, it must also determine whether the first supplementary sentence includes words with unclear reference, because such words leave the semantics of the second sentence unclear. Generally speaking, words with unclear reference include pronouns, quantifiers, and articles. For example, the user's first sentence is “平安福、福满分都是医疗险么?” ("Are 平安福 and 福满分 both medical insurance?"), and the server's reply to the first sentence is “是的。” ("Yes."). The second sentence input by the user in the new round is “第一种能报销什么?” ("What can the first one reimburse?"). This second sentence contains the word with unclear reference “第一种” ("the first one"). If the server does not use the context of the multi-round dialogue to supplement the second sentence, its semantic understanding will be off, it will not be able to give the correct answer, and the user will not be able to proceed to the next operation.
106. If the first supplementary sentence includes a word with unclear reference, replace that word in the first supplementary sentence according to the first sentence detection result to obtain a second supplementary sentence.
If the first supplementary sentence includes a word with unclear reference, the server replaces that word in the first supplementary sentence according to the first sentence detection result and obtains the second supplementary sentence.
If the server determines that the first supplementary sentence includes a word with unclear reference, the first supplementary sentence is semantically unclear, and the server needs to complete it again according to the context of the multi-round dialogue. First, the server extracts the word with unclear reference from the second sentence as the third target word; next, it selects from the first sentence a word with the same part of speech as the third target word as the fourth target word; finally, in the second sentence, it replaces the third target word with the fourth target word.
In this embodiment of the application, grammatical detection is performed on the multi-round dialogue input by the user using the preset corpus sentence-segmentation function and the preset analysis function, and semantically incomplete sentences are completed using the semantics of the dialogue context. This improves the accuracy of semantic analysis results, and also improves the accuracy of searching for the corresponding response information based on those results.
Referring to FIG. 2, another embodiment of the method for semantic completion in multi-round dialogue in the embodiments of this application includes:
201. Acquire a first sentence input by a user.
The server acquires the first sentence input by the user. In human-computer interaction, the server acquires the sentence input by the user and takes it as the first sentence, in preparation for completing the second sentence that follows.
202. Use a preset corpus sentence-segmentation function to split the first sentence and obtain a first input sentence.
The server uses the preset corpus sentence-segmentation function to split the first sentence and obtains the first input sentence. Specifically, the server segments the first sentence to obtain segmented sentences; the server matches the segment corpus in each segmented sentence against a preset corpus, which is a corpus built in a preset intent rule base from business data; if the segment corpus matches the preset corpus, the server splits the segmented sentence immediately before and after the matched segment corpus to obtain split sentences, and takes the split sentences as the first input sentence; if the segment corpus does not match the preset corpus, the server takes the segment corpus directly as the first input sentence.
It should be noted that the server segments the first sentence with the preset corpus sentence-segmentation function and performs grammatical detection on the first input sentence with the preset analysis function, both in preparation for completing the semantics of the second sentence. The server first uses the preset corpus sentence-segmentation function to segment the first sentence. Sentence segmentation is the foundation of natural language processing, and its accuracy directly determines the quality of the part-of-speech tagging, syntactic analysis, word vectors, and text analysis that the server performs afterwards. A simple first sentence can be segmented directly at commas. The server labels each character of the first sentence according to its position in the sentence. For example, labeling the sentence “我爱深圳因为它很美” ("I love Shenzhen because it is beautiful") gives “我N爱N深N圳Y因N为N它N很N美N”, where Y marks a segmentation point and N marks a non-segmentation point. This can be viewed as the server performing a binary classification on each position between adjacent characters of the first sentence to decide whether the sentence should be segmented there.
After segmenting the first sentence, the server obtains separate segmented sentences and then matches the segment corpus in each of them against the preset corpus. A segmented sentence here is one that cannot be broken further by sentence segmentation, so the segment corpus within it is matched to establish whether it needs to be split. The preset corpus is a corpus built in the preset intent rule base from business data. For example, if the preset intent rule base is an insurance library, the preset corpus may include terms such as policy and the insured person's ID number, which the server assembles from the relevant business data. If the segment corpus matches the preset corpus, the server splits the segmented sentence immediately before and after the matched segment corpus to obtain split sentences, and takes the split sentences as the first input sentence; if the segment corpus does not match the preset corpus, the server takes the segment corpus directly as the first input sentence. For example, after segment corpus matching, the sentence “感冒可以投保e生保么?” ("Can someone with a cold take out e生保 insurance?") is split into “感冒/可以/投保/e生保/么?”, where “感冒” ("cold"), “投保” ("take out insurance"), and “e生保” are all preset corpus entries in the preset intent rule base. The server does not need to split a segment corpus that matches the preset corpus any further; it splits directly before and after the matched segment corpus, which saves the time the server spends on splitting the sentence.
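The splitting around preset corpus entries can be sketched as a longest-match scan. This is one plausible reading of the paragraph above, not the application's stated algorithm; the function name `split_on_preset_terms` and the longest-match-first policy are assumptions.

```python
def split_on_preset_terms(sentence, preset_terms):
    """Cut `sentence` immediately before and after each preset corpus entry.

    Text between matched entries is kept as one unsplit piece.
    Longer entries are tried first so that overlapping entries
    resolve to the longest match (an assumption of this sketch).
    """
    terms = sorted((t for t in preset_terms if t), key=len, reverse=True)
    pieces, buffer, i = [], "", 0
    while i < len(sentence):
        match = next((t for t in terms if sentence.startswith(t, i)), None)
        if match:
            if buffer:  # flush unmatched text before the entry
                pieces.append(buffer)
                buffer = ""
            pieces.append(match)
            i += len(match)
        else:
            buffer += sentence[i]
            i += 1
    if buffer:
        pieces.append(buffer)
    return pieces


pieces = split_on_preset_terms("感冒可以投保e生保么?", {"感冒", "投保", "e生保"})
print("/".join(pieces))
```

The unmatched stretches (“可以” and “么?”) come out as single pieces, matching the split shown in the example.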
203. Use a preset analysis function to perform grammatical detection on the first input sentence and obtain a first sentence detection result, the first sentence detection result being the target dependency relationship between the words in the first input sentence.
The server uses the preset analysis function to perform grammatical detection on the first input sentence and obtains the first sentence detection result, which is the target dependency relationship between the words in the first input sentence. Specifically, the server performs part-of-speech tagging and entity extraction on the words in the first input sentence to obtain a first sentence tagging result; the server calculates the dependency probability between the words in the first sentence tagging result, where the dependency probability is the frequency with which a preset dependency relationship occurs; the server determines the target dependency relationship between the words, taking the preset dependency relationship whose dependency probability has the greatest weight as the target dependency relationship; and the server obtains the first sentence detection result, which is the target dependency relationship between the words in the first input sentence.
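The selection of a target dependency relationship can be illustrated with a frequency-based sketch. The description says only that the dependency probability is the frequency of a preset relationship and that the highest-weighted one is chosen; the relation names, the counts, and the function name `target_dependency` below are all hypothetical.

```python
def target_dependency(relation_counts):
    """Pick the preset dependency relation with the highest relative frequency.

    `relation_counts` maps a relation name to how often it was observed
    for a given word pair (hypothetical data for this sketch).
    Returns (best_relation, probabilities), or (None, {}) if there are no counts.
    """
    total = sum(relation_counts.values())
    if total == 0:
        return None, {}
    probabilities = {rel: n / total for rel, n in relation_counts.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


# Hypothetical counts for one word pair.
counts = {"subject-predicate": 6, "modifier-head": 3, "verb-object": 1}
best, probs = target_dependency(counts)
print(best, probs[best])
```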
It should be noted that the preset analysis function here is a dependency analysis function, which the server uses to perform grammatical detection on the first sentence. Dependency analysis expresses grammar through the dependency relationships between the words in a sentence: if one word modifies another, the first word is said to depend on the second. Once the relationships between the words in a sentence are made explicit, the server can accurately determine the part of speech of each word and the relationships between words, which prepares for completing the sentence. Before performing dependency analysis, the server needs to tag the sentence with parts of speech. The server labels each character of the first sentence according to its position within a word, for which the BMES tagging scheme can be used: B marks the first character of a word, M marks a character in the middle of a word (a word may have several middle characters), E marks the last character of a word, and S marks a word consisting of a single character. For example, labeling “网商银行是蚂蚁金服微贷事业部的最重要产品” ("MYbank is the most important product of Ant Financial's micro-loan division") gives “BMMESBMMEBMMMESBMEBE”, and the corresponding word segmentation is “网商银行/是/蚂蚁金服/微贷事业部/的/最重要/产品”. The server looks up the part of speech of each segmented word in a preset word library and tags the segmentation result accordingly. When a single entity appears in the sentence (the sentence includes only a subject, a predicate, or an object), the single entity is extracted and saved in the first sentence detection result for later use.
204. Using the preset corpus sentence-segmentation function and the preset analysis function, perform grammatical detection on an acquired second sentence input by the user in a new round to obtain a second sentence detection result, the second sentence detection result being the target dependency relationship between the words in the second sentence.
The server uses the preset corpus sentence-segmentation function and the preset analysis function to perform grammatical detection on the acquired second sentence input by the user in a new round, and obtains the second sentence detection result, which is the target dependency relationship between the words in the second sentence.
The second sentence here is the sentence that follows the first sentence, and the first sentence serves as the basis for filling in the semantically missing part of the second sentence. Grammatical detection is performed on the second sentence in the same way as on the first sentence, and the result is taken as the second sentence detection result.
205. Determine whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
The server determines whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence.
It can be understood that part of the second sentence input by the user in a new round may be omitted, for example the predicate and the object. The server therefore needs to examine the second sentence: if its semantic expression is incomplete, the server completes it according to the context of the multi-round dialogue. To decide whether the second sentence meets the condition of semantic incompleteness, the server first determines whether the second sentence analysis result contains only a single entity and whether the second sentence is an interrogative sentence.
206. When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first supplementary sentence.
When the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, the server completes the semantics of the second sentence according to the first sentence detection result and obtains the first supplementary sentence. Specifically, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, the server extracts the single entity from the second sentence detection result and takes it as the first target word; the server selects from the first sentence detection result a single entity with the same grammatical structure as the first target word to obtain the second target word; and, in the first sentence, the server replaces the second target word with the first target word to obtain the first supplementary sentence.
If the second sentence detection result contains only a single entity and the second sentence is an interrogative sentence, the server cannot accurately identify the meaning of the second sentence and cannot return the corresponding preset response. The server therefore fills in the semantically missing part of the second sentence according to the acquired context to obtain its full semantics, and then returns the corresponding answer. The server obtains the first sentence detection result and the second sentence detection result, both of which include a part-of-speech analysis of the sentence. It extracts the single entity in the second sentence as the first target word, selects from the first sentence detection result a single entity with the same grammatical structure as the first target word as the second target word, and, in the first sentence, substitutes the first target word for the second target word to obtain the first supplementary sentence.
For example, the first sentence entered by the user is "感冒可以投保e生保么?" ("Can a person with a cold take out e生保 insurance?"). The server performs part-of-speech tagging and dependency analysis on it and obtains "感冒/nhd 可以/c 投保/vn e生保/nbx 么/y?". Here nhd and nbx are parts of speech defined by the preset corpus. For example, if the preset intent rule base that holds the preset corpus is an insurance library, nhd denotes disease terms in that library and nbx denotes insurance product names; c denotes an auxiliary verb, vn a verbal noun, and y a modal particle. The extracted single entities are: 感冒/nhd, 投保/vn, e生保/nbx. In the next round the user enters the second sentence "癌症哪?" ("What about cancer?"), which the server parses as "癌症/nhd 哪/y". The second sentence detection result clearly includes only a single entity and the sentence ends in a questioning tone, so the condition for incomplete semantics is met and the server completes the second sentence. From the two detection results, 感冒/nhd and 癌症/nhd are words of the same class, so the server substitutes 癌症 for 感冒 in the first sentence and obtains "癌症可以投保e生保么?" ("Can a person with cancer take out e生保 insurance?"), which completes the second sentence.
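The substitution in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the application: the tagger output is hard-coded, the punctuation tag "w" and the names `lone_entity` and `complete_first` are hypothetical, and "same grammatical structure" is approximated as "same part-of-speech tag".

```python
# Tokens are (word, tag) pairs. Tags nhd/nbx/y come from the example above;
# the punctuation tag "w" and the helper names are assumptions for this sketch.
ENTITY_TAGS = {"nhd", "nbx"}   # domain entities: disease terms, product names
IGNORED_TAGS = {"y", "w"}      # modal particles and punctuation


def lone_entity(tagged):
    """Return the (word, tag) of the only content token of a question, else None."""
    content = [(w, t) for w, t in tagged if t not in IGNORED_TAGS]
    is_question = any(t == "y" for _, t in tagged)
    if is_question and len(content) == 1 and content[0][1] in ENTITY_TAGS:
        return content[0]
    return None


def complete_first(first_tagged, second_tagged):
    """Put the second turn's lone entity in place of a same-tag entity of the first turn."""
    target = lone_entity(second_tagged)
    if target is None:
        return None  # the second turn is not a single-entity question
    new_word, tag = target
    words, replaced = [], False
    for word, t in first_tagged:
        if not replaced and t == tag:
            words.append(new_word)   # first target word replaces second target word
            replaced = True
        else:
            words.append(word)
    return "".join(words) if replaced else None


first = [("感冒", "nhd"), ("可以", "c"), ("投保", "vn"),
         ("e生保", "nbx"), ("么", "y"), ("?", "w")]
second = [("癌症", "nhd"), ("哪", "y"), ("?", "w")]
print(complete_first(first, second))
```

Replacing only the first same-tag entity is a simplification; a first sentence with several entities of the same class would need a further selection rule that the text does not specify.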
207. Determine whether the first completed sentence includes words with unclear reference, where words with unclear reference include pronouns, quantifiers, and articles.
The server determines whether the first completed sentence includes words with unclear reference, which include pronouns, quantifiers, and articles. Specifically, the server obtains the detection result of the first completed sentence, which is the combination of the second sentence detection result and the first sentence detection result, and determines whether that detection result includes pronouns, quantifiers, or articles.
Note that after the server has completed the newly entered second sentence and obtained the first completed sentence, it must still check whether the first completed sentence includes words with unclear reference, because such words leave the semantics of the second sentence unclear. In general, words with unclear reference include pronouns, quantifiers, and articles. For example, the user's first sentence is "平安福、福满分都是医疗险么?" ("Are 平安福 and 福满分 both medical insurance?"), and the server replies "是的。" ("Yes."). The user's next sentence is "第一种能报销什么?" ("What can the first one reimburse?"). This second sentence contains "第一种" ("the first one"), whose reference is unclear. If the server does not use the context of the multi-round dialogue to complete the second sentence, its semantic understanding will be off, it will not be able to give the correct answer, and the user will be unable to continue to the next step.
208. If the first completed sentence includes a word with unclear reference, replace that word according to the first sentence detection result to obtain a second completed sentence.
If the first completed sentence includes a word with unclear reference, the server replaces that word according to the first sentence detection result to obtain a second completed sentence. Specifically, if the detection result of the first completed sentence includes a word with unclear reference, the server extracts that word and takes it as the third target word; the server searches the first sentence detection result for a word with the same grammatical structure as the third target word, which gives the fourth target word; in the second sentence, the server replaces the third target word with the fourth target word to obtain the second completed sentence.
If the server determines that the first completed sentence includes a word with unclear reference, the first completed sentence is still semantically unclear, and the server needs to complete it again from the context of the multi-round dialogue. First, the server extracts the word with unclear reference from the second sentence as the third target word; second, it selects from the first sentence a word with the same part of speech as the third target word, as the fourth target word; finally, in the second sentence, it replaces the third target word with the fourth target word.
For example, the user's first sentence is "平安福、福满分,都是医疗险么?" ("Are 平安福 and 福满分 both medical insurance?"). The server performs part-of-speech tagging and dependency analysis on it and obtains "平安福/nbx、福满分/nbx,是/v 医疗险/nbx 么/y?". Here nbx is a part of speech defined by the preset corpus: if the preset intent rule base that holds the preset corpus is an insurance library, nbx denotes insurance product names in that library; v denotes a verb and y a modal particle. The user's next sentence is "前者能报销什么?" ("What can the former reimburse?"). The server performs part-of-speech tagging and dependency analysis on it and obtains "前者/r 能/v 报销/v 什么/y?", where r denotes a pronoun, v a verb, and y a modal particle. The second sentence contains the word "前者" ("the former"), whose reference is unclear. The server uses the context of the multi-round dialogue to complete the second sentence: combined with the detection result of the first sentence, "前者" refers to "平安福", so "平安福" is substituted directly into the second sentence to obtain "平安福能报销什么?" ("What can 平安福 reimburse?"). This is the second completed sentence, on which the server can base its response.
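The resolution of "前者" in this example can be illustrated with a small Python sketch. It rests on assumptions that the text does not state: the mapping `ORDINAL_INDEX` from ordinal pronouns to list positions, the rule that candidate antecedents are the entities listed before the first verb of the first sentence, the punctuation tag "w", and the function names are all hypothetical.

```python
ENTITY_TAGS = {"nbx", "nhd"}   # entity tags used in the examples above
PRONOUN_TAG = "r"              # pronoun tag used in the example above
# Hypothetical mapping from an ordinal pronoun to a position among the antecedents.
ORDINAL_INDEX = {"前者": 0, "后者": -1}


def antecedents(first_tagged):
    """Collect the entities that appear before the first verb of the first sentence."""
    candidates = []
    for word, tag in first_tagged:
        if tag == "v":
            break
        if tag in ENTITY_TAGS:
            candidates.append(word)
    return candidates


def complete_second(first_tagged, second_tagged):
    """Replace each ordinal pronoun in the second sentence with its antecedent."""
    candidates = antecedents(first_tagged)
    words = []
    for word, tag in second_tagged:
        if tag == PRONOUN_TAG and word in ORDINAL_INDEX and candidates:
            words.append(candidates[ORDINAL_INDEX[word]])  # fourth target word
        else:
            words.append(word)
    return "".join(words)


first = [("平安福", "nbx"), ("、", "w"), ("福满分", "nbx"), (",", "w"),
         ("是", "v"), ("医疗险", "nbx"), ("么", "y"), ("?", "w")]
second = [("前者", "r"), ("能", "v"), ("报销", "v"), ("什么", "y"), ("?", "w")]
print(complete_second(first, second))
```

Restricting the candidates to the entities before the first verb keeps "医疗险" out of the antecedent list, so that a hypothetical "后者" ("the latter") would resolve to "福满分" rather than to the predicate noun.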
In this embodiment of the application, grammatical detection is performed on the multi-round dialogue entered by the user with the preset corpus sentence-segmentation function and the preset analysis function, and semantically incomplete sentences are completed using the semantics of the dialogue context. This improves the accuracy of the semantic analysis result and, with it, the accuracy of retrieving the corresponding response from that result.
The method for semantic completion in a multi-round dialogue in the embodiments of this application has been described above; the apparatus for semantic completion in a multi-round dialogue is described below. Referring to FIG. 3, one embodiment of the apparatus includes:
a first acquiring unit 301, configured to perform grammatical detection on the acquired first sentence entered by the user, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relation between the words in the first sentence;
a second acquiring unit 302, configured to perform grammatical detection on the acquired second sentence entered by the user in a new round, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relation between the words in the second sentence;
a first judging unit 303, configured to judge whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
a first completion unit 304, configured to, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
a second judging unit 305, configured to judge whether the first completed sentence includes words with unclear reference, where words with unclear reference include pronouns, quantifiers, and articles;
a second completion unit 306, configured to, if the first completed sentence includes a word with unclear reference, replace that word according to the first sentence detection result to obtain a second completed sentence.
In this embodiment of the application, the first acquiring unit 301 performs grammatical detection on the acquired first sentence entered by the user, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain a first sentence detection result, which is the target dependency relation between the words in the first sentence; the second acquiring unit 302 performs grammatical detection in the same way on the acquired second sentence entered by the user in a new round to obtain a second sentence detection result, which is the target dependency relation between the words in the second sentence; the first judging unit 303 judges whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence; when it does, the first completion unit 304 fills in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence; the second judging unit 305 judges whether the first completed sentence includes words with unclear reference, which include pronouns, quantifiers, and articles; if it does, the second completion unit 306 replaces those words according to the first sentence detection result to obtain a second completed sentence.
In this embodiment of the application, grammatical detection is performed on the multi-round dialogue entered by the user with the preset corpus sentence-segmentation function and the preset analysis function, and semantically incomplete sentences are completed using the semantics of the dialogue context. This improves the accuracy of the semantic analysis result and, with it, the accuracy of retrieving the corresponding response from that result.
Referring to FIG. 4, another embodiment of the apparatus for semantic completion in a multi-round dialogue in the embodiments of this application includes:
a first acquiring unit 301, configured to perform grammatical detection on the acquired first sentence entered by the user, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relation between the words in the first sentence;
a second acquiring unit 302, configured to perform grammatical detection on the acquired second sentence entered by the user in a new round, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relation between the words in the second sentence;
a first judging unit 303, configured to judge whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
a first completion unit 304, configured to, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, fill in the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
a second judging unit 305, configured to judge whether the first completed sentence includes words with unclear reference, where words with unclear reference include pronouns, quantifiers, and articles;
a second completion unit 306, configured to, if the first completed sentence includes a word with unclear reference, replace that word according to the first sentence detection result to obtain a second completed sentence.
Optionally, the first acquiring unit 301 includes:
an acquiring module 3011, configured to acquire the first sentence entered by the user;
a segmentation module 3012, configured to segment the first sentence using the preset corpus sentence-segmentation function to obtain a first input sentence;
a detection module 3013, configured to perform grammatical detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the first sentence detection result being the target dependency relation between the words in the first input sentence.
Optionally, the segmentation module 3012 is specifically configured to:
break the first sentence into clauses to obtain segmented sentences;
match the segment corpus in each segmented sentence against the preset corpus, the preset corpus being corpus established in the preset intent rule base on the basis of business data;
if the segment corpus matches the preset corpus, split the segmented sentence immediately before and after the segment corpus to obtain split sentences, and take the split sentences as the first input sentence;
if the segment corpus does not match the preset corpus, take the segment corpus directly as the first input sentence.
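The segmentation steps above can be sketched as follows. This is a minimal illustration that assumes the preset corpus is a plain list of domain terms and that clauses are delimited by common punctuation; `PRESET_CORPUS`, `split_clauses`, and `segment` are hypothetical names, and the actual preset corpus sentence-segmentation function is not disclosed in this form.

```python
import re

# Hypothetical preset corpus: domain terms taken from the examples in this document.
PRESET_CORPUS = ["e生保", "平安福", "福满分"]


def split_clauses(text):
    """Break a sentence into clauses at common Chinese and ASCII punctuation."""
    return [c for c in re.split(r"[,。?!、,.?!]", text) if c]


def segment(text, preset=PRESET_CORPUS):
    """Split each clause before and after any preset term; keep unmatched clauses whole."""
    clauses = split_clauses(text)
    if not preset:
        return clauses
    # Longest terms first, so a longer term wins over a shorter term it contains.
    terms = sorted(preset, key=len, reverse=True)
    pattern = re.compile("(" + "|".join(re.escape(t) for t in terms) + ")")
    result = []
    for clause in clauses:
        if pattern.search(clause):
            # A capturing group makes re.split keep the matched term as its own piece.
            result.extend(piece for piece in pattern.split(clause) if piece)
        else:
            result.append(clause)
    return result


print(segment("感冒可以投保e生保么?"))
```

Splitting around the matched term keeps a product name such as e生保 as one unit, so that a general-purpose tagger applied afterwards does not break it into characters.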
Optionally, the detection module 3013 is specifically configured to:
perform part-of-speech tagging and entity extraction on the words in the first input sentence to obtain a first sentence tagging result;
calculate the dependency probability between the words, the words being those in the first sentence tagging result and the dependency probability being the frequency with which a preset dependency relation occurs;
determine the target dependency relation between the words, the preset dependency relation with the largest dependency probability being the target dependency relation between the words;
obtain the first sentence detection result, the first sentence detection result being the target dependency relation between the words in the first input sentence.
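The selection of the target dependency relation can be illustrated as an argmax over relative frequencies. The sketch below assumes that, for one pair of words, a list of observed relation labels is available; the labels "VOB" and "ATT" and the function name `target_dependency` are hypothetical, since the text does not name the preset dependency relations or say where the counts come from.

```python
from collections import Counter


def target_dependency(observed_relations):
    """Return (relation, probability) for the most frequent relation of a word pair.

    observed_relations holds the preset dependency relations seen for the pair,
    for instance in a reference corpus; the probability is the relative frequency.
    """
    if not observed_relations:
        raise ValueError("no observations for this word pair")
    counts = Counter(observed_relations)
    total = sum(counts.values())
    probabilities = {rel: n / total for rel, n in counts.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


# Hypothetical observations for the pair (投保, e生保).
observed = ["VOB", "VOB", "ATT", "VOB"]
print(target_dependency(observed))
```

On a tie, `max` returns the relation that was counted first; a real system would need an explicit tie-breaking rule.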
Optionally, the first completion unit 304 is specifically configured to:
when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, extract the single entity from the second sentence detection result and take it as the first target word;
select, from the first sentence detection result, a single entity with the same grammatical structure as the first target word to obtain the second target word;
in the first sentence, replace the second target word with the first target word to obtain the first completed sentence.
Optionally, the second judging unit 305 is specifically configured to:
obtain the detection result of the first completed sentence, the detection result of the first completed sentence being the combination of the second sentence detection result and the first sentence detection result;
judge whether the detection result of the first completed sentence includes pronouns, quantifiers, or articles.
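The check performed by the second judging unit can be sketched as a scan of the tagged tokens. Only the pronoun tag r appears in the examples of this document; the quantifier tag "q", the article tag "art", and the name `find_ambiguous` are assumptions made for this illustration.

```python
# Tag "r" (pronoun) is taken from the examples in this document;
# "q" (quantifier) and "art" (article) are assumed tag names.
AMBIGUOUS_TAGS = {"r": "pronoun", "q": "quantifier", "art": "article"}


def find_ambiguous(tagged):
    """Return (word, category) for every token whose tag marks an unclear reference."""
    return [(word, AMBIGUOUS_TAGS[tag]) for word, tag in tagged if tag in AMBIGUOUS_TAGS]


completed = [("前者", "r"), ("能", "v"), ("报销", "v"), ("什么", "y")]
print(find_ambiguous(completed))
```

An empty result means that no further completion is needed and the sentence can be passed on for a response.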
Optionally, the second completion unit 306 is specifically configured to:
if the detection result of the first completed sentence includes a word with unclear reference, extract that word and take it as the third target word;
select, from the first sentence detection result, a word with the same grammatical structure as the third target word to obtain the fourth target word;
in the second sentence, replace the third target word with the fourth target word to obtain the second completed sentence.
In this embodiment of the application, the first acquiring unit 301 includes: the acquiring module 3011, configured to acquire the first sentence entered by the user; the segmentation module 3012, configured to segment the first sentence using the preset corpus sentence-segmentation function to obtain the first input sentence; and the detection module 3013, configured to perform grammatical detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, which is the target dependency relation between the words in the first input sentence. The second acquiring unit 302 performs grammatical detection on the acquired second sentence entered by the user in a new round, using the preset corpus sentence-segmentation function and the preset analysis function, to obtain the second sentence detection result, which is the target dependency relation between the words in the second sentence; the first judging unit 303 judges whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence; when it does, the first completion unit 304 fills in the semantically missing part of the second sentence according to the first sentence detection result to obtain the first completed sentence; the second judging unit 305 judges whether the first completed sentence includes words with unclear reference, which include pronouns, quantifiers, and articles; if it does, the second completion unit 306 replaces those words according to the first sentence detection result to obtain the second completed sentence.
In this embodiment of the application, grammatical detection is performed on the multi-round dialogue entered by the user with the preset corpus sentence-segmentation function and the preset analysis function, and semantically incomplete sentences are completed using the semantics of the dialogue context. This improves the accuracy of the semantic analysis result and, with it, the accuracy of retrieving the corresponding response from that result.
FIG. 3 and FIG. 4 above describe the apparatus for semantic completion in a multi-round dialogue in the embodiments of this application in detail from the perspective of modular functional entities; the device for semantic completion in a multi-round dialogue is described in detail below from the perspective of hardware processing.
The components of the device for semantic completion in a multi-round dialogue are introduced below with reference to FIG. 5:
FIG. 5 is a schematic structural diagram of a device for semantic completion in a multi-round dialogue provided by an embodiment of this application. The device 500 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 501 and a memory 509, as well as one or more storage media 508 (for example, one or more mass storage devices) that store application programs 507 or data 506. The memory 509 and the storage medium 508 may provide transient or persistent storage. The program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the device. Further, the processor 501 may be configured to communicate with the storage medium 508 and to execute, on the device 500, the series of instruction operations in the storage medium 508.
The device 500 for semantic completion in a multi-round dialogue may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art will understand that the device structure shown in FIG. 5 does not limit the device for semantic completion in a multi-round dialogue, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The processor 501 is the control center of the device for semantic completion in a multi-round dialogue and can perform processing according to the method for semantic completion in a multi-round dialogue. The processor 501 connects the parts of the whole device through various interfaces and lines. By running or executing the software programs and/or modules stored in the memory 509 and calling the data stored in the memory 509, it performs grammatical detection on the multi-round dialogue entered by the user with the preset corpus sentence-segmentation function and the preset analysis function, and completes semantically incomplete sentences using the semantics of the dialogue context, which improves the accuracy of the semantic analysis result. The storage medium 508 and the memory 509 are both carriers for storing data. In this embodiment of the application, the storage medium 508 may be an internal memory with a small capacity but high speed, and the memory 509 may be an external memory with a large capacity but low speed.
The memory 509 may be used to store software programs and modules. The processor 501 runs the software programs and modules stored in the memory 509 and thereby executes the various functional applications and data processing of the device 500 for semantic completion in a multi-round dialogue. The memory 509 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created through use of the device. In addition, the memory 509 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. The program for semantic completion in a multi-round dialogue provided in this embodiment of the application and the received data stream are stored in the memory, and the processor 501 retrieves them from the memory 509 when they are needed.
When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center over a wired connection (such as coaxial cable, optical fiber, or twisted pair) or a wireless connection (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can store to, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)).
Optionally, the computer-readable storage medium (or storage medium) may be non-volatile or volatile.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (20)

  1. A method for semantic completion in a multi-round dialogue, comprising:
    performing grammar detection on an acquired first sentence input by a user, using a preset corpus sentence segmentation function and a preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first sentence;
    performing grammar detection on an acquired second sentence input by the user in a new round, using the preset corpus sentence segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relationships between the words in the second sentence;
    determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
    determining whether the first completed sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles;
    if the first completed sentence includes a word with unclear reference, replacing the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain a second completed sentence.
  2. The method according to claim 1, wherein performing grammar detection on the acquired first sentence input by the user, using the preset corpus sentence segmentation function and the preset analysis function, to obtain the first sentence detection result comprises:
    acquiring the first sentence input by the user;
    splitting the first sentence using the preset corpus sentence segmentation function to obtain a first input sentence;
    performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
  3. The method according to claim 2, wherein splitting the first sentence using the preset corpus sentence segmentation function to obtain the first input sentence comprises:
    breaking the first sentence into segments to obtain a segmented sentence;
    matching the segment corpus in the segmented sentence against a preset corpus, the preset corpus being a corpus built in a preset intent rule base from business data;
    if the segment corpus matches the preset corpus, splitting the segmented sentence at the positions before and after the segment corpus to obtain a split sentence, and using the split sentence as the first input sentence;
    if the segment corpus does not match the preset corpus, using the segment corpus directly as the first input sentence.
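
The segmentation steps of claim 3 can be sketched as follows. This is only an illustration under stated assumptions: the claim does not say how segments are broken or how matching works, so breaking at punctuation and substring matching are assumed here, and the `PRESET_CORPUS` entries and the function names `split_on_punctuation` and `segment` are hypothetical.

```python
import re

# Hypothetical preset corpus. The claim only says it is built from business
# data in a preset intent rule base; these entries are invented examples.
PRESET_CORPUS = ["credit card", "savings account"]


def split_on_punctuation(text):
    """Break raw input into segments at punctuation (an assumed rule)."""
    parts = re.split(r"[,.;!?，。；！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def segment(text, preset_corpus=PRESET_CORPUS):
    """Return input pieces, cutting around a preset phrase when one matches."""
    pieces = []
    for seg in split_on_punctuation(text):
        match = next((p for p in preset_corpus if p in seg), None)
        if match is None:
            pieces.append(seg)  # no match: use the segment as it is
            continue
        start = seg.index(match)
        end = start + len(match)
        # Cut before and after the matched phrase, dropping empty pieces.
        for piece in (seg[:start], seg[start:end], seg[end:]):
            if piece.strip():
                pieces.append(piece.strip())
    return pieces
```

Substring matching keeps the sketch short; a production rule base would more plausibly match on tokenized text.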
  4. The method according to claim 2, wherein performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result comprises:
    performing part-of-speech tagging and entity extraction on the words in the first input sentence to obtain a first sentence tagging result;
    calculating a dependency probability between the words in the first sentence tagging result, the dependency probability being the frequency with which a preset dependency relationship occurs;
    determining the target dependency relationship between the words, the target dependency relationship being the preset dependency relationship that corresponds to the largest dependency probability;
    obtaining the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
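
The selection rule in claim 4 (take the preset dependency relationship with the largest probability, where probability is an occurrence frequency) can be sketched as below. The claim does not say where the frequencies come from, so this sketch assumes a list of relation labels observed for one word pair; the function name `target_dependency` and the relation labels are hypothetical.

```python
from collections import Counter


def target_dependency(observed_relations):
    """Pick the relation with the highest relative frequency for one word pair.

    observed_relations: relation labels seen for the pair (assumed input).
    Returns (best_relation, probabilities).
    """
    if not observed_relations:
        raise ValueError("no candidate relations observed for this word pair")
    counts = Counter(observed_relations)
    total = sum(counts.values())
    # Relative frequency stands in for the claim's "dependency probability".
    probabilities = {rel: n / total for rel, n in counts.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities
```

For example, a pair observed as `["subject-verb", "subject-verb", "verb-object"]` would be assigned `subject-verb` as its target dependency relationship.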
  5. The method according to claim 1, wherein, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain the first completed sentence comprises:
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, extracting the single entity from the second sentence detection result and using the single entity as a first target word;
    selecting, from the first sentence detection result, the single entity that has the same grammatical structure as the first target word, to obtain a second target word;
    in the first sentence, replacing the second target word with the first target word to obtain the first completed sentence.
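
The substitution in claim 5 can be sketched as follows. The representation is an assumption: each detection result is modeled as a list of dictionaries with `word`, `role`, and `is_entity` keys, and "same grammatical structure" is approximated by an equal `role` label. The function name `complete_from_previous` and the sample sentences are hypothetical.

```python
def complete_from_previous(first_sentence, first_result, second_result, is_question):
    """Rebuild an elliptical follow-up question from the previous sentence.

    Returns the first completed sentence, or None when the claim's
    precondition (single entity and interrogative) does not hold or no
    matching entity exists in the first sentence.
    """
    entities = [tok for tok in second_result if tok["is_entity"]]
    if not is_question or len(entities) != 1:
        return None
    first_target = entities[0]  # the single entity of the second sentence
    for tok in first_result:
        # Second target word: an entity playing the same role (assumed test).
        if tok["is_entity"] and tok["role"] == first_target["role"]:
            return first_sentence.replace(tok["word"], first_target["word"], 1)
    return None


first_sentence = "What is the annual fee for the gold card?"
first_result = [
    {"word": "annual fee", "role": "subject", "is_entity": False},
    {"word": "gold card", "role": "object", "is_entity": True},
]
second_result = [{"word": "platinum card", "role": "object", "is_entity": True}]
completed = complete_from_previous(first_sentence, first_result, second_result, True)
```

With these inputs, a follow-up such as "And the platinum card?" is expanded by swapping `gold card` for `platinum card` in the first sentence.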
  6. The method according to any one of claims 1 to 5, wherein determining whether the first completed sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles, comprises:
    acquiring a detection result of the first completed sentence, the detection result of the first completed sentence being a combination of the second sentence detection result and the first sentence detection result;
    determining whether the detection result of the first completed sentence includes pronouns, quantifiers, and articles.
  7. The method according to claim 6, wherein, if the first completed sentence includes a word with unclear reference, replacing the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain the second completed sentence comprises:
    if the detection result of the first completed sentence includes a word with unclear reference, extracting the word with unclear reference and using it as a third target word;
    selecting, from the first sentence detection result, a word that has the same grammatical structure as the third target word, to obtain a fourth target word;
    in the second sentence, replacing the third target word with the fourth target word to obtain the second completed sentence.
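
The check in claim 6 and the replacement in claim 7 can be sketched together. As before, the data layout is an assumption: tokens are dictionaries with `word`, `pos`, and `role` keys, "same grammatical structure" is approximated by an equal `role`, and any one of the three parts of speech is treated as unclear. The names `UNCLEAR_POS` and `replace_unclear_word` are hypothetical.

```python
import re

# Parts of speech the claims treat as having unclear reference.
UNCLEAR_POS = {"pronoun", "quantifier", "article"}


def replace_unclear_word(sentence, sentence_result, first_result):
    """Replace the first unclear word with a same-role word from the first sentence.

    Returns the sentence unchanged when nothing unclear is found or no
    candidate with a matching role exists.
    """
    for tok in sentence_result:
        if tok["pos"] not in UNCLEAR_POS:
            continue  # only pronouns, quantifiers, and articles are candidates
        for cand in first_result:
            if cand["role"] == tok["role"] and cand["pos"] not in UNCLEAR_POS:
                # Word boundaries avoid rewriting "it" inside e.g. "limit".
                pattern = r"\b" + re.escape(tok["word"]) + r"\b"
                return re.sub(pattern, lambda _: cand["word"], sentence, count=1)
    return sentence


resolved = replace_unclear_word(
    "How do I activate it?",
    [{"word": "it", "pos": "pronoun", "role": "object"}],
    [{"word": "the gold card", "pos": "noun", "role": "object"}],
)
```

The `\b` word boundary only suits space-delimited text; for Chinese input the replacement would need to operate on segmented tokens instead.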
  8. A device for semantic completion in a multi-round dialogue, comprising:
    a first acquisition unit, configured to perform grammar detection on an acquired first sentence input by a user, using a preset corpus sentence segmentation function and a preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first sentence;
    a second acquisition unit, configured to perform grammar detection on an acquired second sentence input by the user in a new round, using the preset corpus sentence segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relationships between the words in the second sentence;
    a first determination unit, configured to determine whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
    a first completion unit, configured to, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, complete the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
    a second determination unit, configured to determine whether the first completed sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles;
    a second completion unit, configured to, if the first completed sentence includes a word with unclear reference, replace the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain a second completed sentence.
  9. Equipment for semantic completion in a multi-round dialogue, comprising:
    a memory and at least one processor, the memory storing instructions, and the memory and the at least one processor being interconnected by a line;
    the at least one processor invoking the instructions in the memory to cause the equipment for semantic completion in a multi-round dialogue to perform the following steps:
    performing grammar detection on an acquired first sentence input by a user, using a preset corpus sentence segmentation function and a preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first sentence;
    performing grammar detection on an acquired second sentence input by the user in a new round, using the preset corpus sentence segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relationships between the words in the second sentence;
    determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
    determining whether the first completed sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles;
    if the first completed sentence includes a word with unclear reference, replacing the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain a second completed sentence.
  10. The equipment according to claim 9, wherein, when performing grammar detection on the acquired first sentence input by the user, using the preset corpus sentence segmentation function and the preset analysis function, to obtain the first sentence detection result, the following steps are specifically performed:
    acquiring the first sentence input by the user;
    splitting the first sentence using the preset corpus sentence segmentation function to obtain a first input sentence;
    performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
  11. The equipment according to claim 10, wherein, when splitting the first sentence using the preset corpus sentence segmentation function to obtain the first input sentence, the following steps are specifically performed:
    breaking the first sentence into segments to obtain a segmented sentence;
    matching the segment corpus in the segmented sentence against a preset corpus, the preset corpus being a corpus built in a preset intent rule base from business data;
    if the segment corpus matches the preset corpus, splitting the segmented sentence at the positions before and after the segment corpus to obtain a split sentence, and using the split sentence as the first input sentence;
    if the segment corpus does not match the preset corpus, using the segment corpus directly as the first input sentence.
  12. The equipment according to claim 10, wherein, when performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the following steps are specifically performed:
    performing part-of-speech tagging and entity extraction on the words in the first input sentence to obtain a first sentence tagging result;
    calculating a dependency probability between the words in the first sentence tagging result, the dependency probability being the frequency with which a preset dependency relationship occurs;
    determining the target dependency relationship between the words, the target dependency relationship being the preset dependency relationship that corresponds to the largest dependency probability;
    obtaining the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
  13. The equipment according to claim 9, wherein, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, and the semantically missing part of the second sentence is completed according to the first sentence detection result to obtain the first completed sentence, the following steps are specifically performed:
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, extracting the single entity from the second sentence detection result and using the single entity as a first target word;
    selecting, from the first sentence detection result, the single entity that has the same grammatical structure as the first target word, to obtain a second target word;
    in the first sentence, replacing the second target word with the first target word to obtain the first completed sentence.
  14. The equipment according to any one of claims 9 to 13, wherein, when determining whether the first completed sentence includes words with unclear reference, the following steps are specifically performed:
    acquiring a detection result of the first completed sentence, the detection result of the first completed sentence being a combination of the second sentence detection result and the first sentence detection result;
    determining whether the detection result of the first completed sentence includes pronouns, quantifiers, and articles.
  15. The equipment according to claim 14, wherein, if the first completed sentence includes a word with unclear reference, when replacing the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain the second completed sentence, the following steps are specifically performed:
    if the detection result of the first completed sentence includes a word with unclear reference, extracting the word with unclear reference and using it as a third target word;
    selecting, from the first sentence detection result, a word that has the same grammatical structure as the third target word, to obtain a fourth target word;
    in the second sentence, replacing the third target word with the fourth target word to obtain the second completed sentence.
  16. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the following steps:
    performing grammar detection on an acquired first sentence input by a user, using a preset corpus sentence segmentation function and a preset analysis function, to obtain a first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first sentence;
    performing grammar detection on an acquired second sentence input by the user in a new round, using the preset corpus sentence segmentation function and the preset analysis function, to obtain a second sentence detection result, the second sentence detection result being the target dependency relationships between the words in the second sentence;
    determining whether the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence;
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, completing the semantically missing part of the second sentence according to the first sentence detection result to obtain a first completed sentence;
    determining whether the first completed sentence includes words with unclear reference, the words with unclear reference including pronouns, quantifiers, and articles;
    if the first completed sentence includes a word with unclear reference, replacing the word with unclear reference in the first completed sentence according to the first sentence detection result to obtain a second completed sentence.
  17. The computer-readable storage medium according to claim 16, wherein, when performing grammar detection on the acquired first sentence input by the user, using the preset corpus sentence segmentation function and the preset analysis function, to obtain the first sentence detection result, the following steps are specifically performed:
    acquiring the first sentence input by the user;
    splitting the first sentence using the preset corpus sentence segmentation function to obtain a first input sentence;
    performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
  18. The computer-readable storage medium according to claim 17, wherein, when splitting the first sentence using the preset corpus sentence segmentation function to obtain the first input sentence, the following steps are specifically performed:
    breaking the first sentence into segments to obtain a segmented sentence;
    matching the segment corpus in the segmented sentence against a preset corpus, the preset corpus being a corpus built in a preset intent rule base from business data;
    if the segment corpus matches the preset corpus, splitting the segmented sentence at the positions before and after the segment corpus to obtain a split sentence, and using the split sentence as the first input sentence;
    if the segment corpus does not match the preset corpus, using the segment corpus directly as the first input sentence.
  19. The computer-readable storage medium according to claim 17, wherein, when performing grammar detection on the first input sentence using the preset analysis function to obtain the first sentence detection result, the following steps are specifically performed:
    performing part-of-speech tagging and entity extraction on the words in the first input sentence to obtain a first sentence tagging result;
    calculating a dependency probability between the words in the first sentence tagging result, the dependency probability being the frequency with which a preset dependency relationship occurs;
    determining the target dependency relationship between the words, the target dependency relationship being the preset dependency relationship that corresponds to the largest dependency probability;
    obtaining the first sentence detection result, the first sentence detection result being the target dependency relationships between the words in the first input sentence.
  20. The computer-readable storage medium according to claim 16, wherein, when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, and the semantically missing part of the second sentence is completed according to the first sentence detection result to obtain the first completed sentence, the following steps are specifically performed:
    when the second sentence detection result includes only a single entity and the second sentence is an interrogative sentence, extracting the single entity from the second sentence detection result and using the single entity as a first target word;
    selecting, from the first sentence detection result, the single entity that has the same grammatical structure as the first target word, to obtain a second target word;
    in the first sentence, replacing the second target word with the first target word to obtain the first completed sentence.
PCT/CN2020/098846 2020-02-12 2020-06-29 Method, device, and equipment for semantic completion in a multi-round dialogue, and storage medium WO2021159656A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010088078.9 2020-02-12
CN202010088078.9A CN111325034A (en) 2020-02-12 2020-02-12 Method, device, equipment and storage medium for semantic completion in multi-round conversation

Publications (1)

Publication Number Publication Date
WO2021159656A1 true WO2021159656A1 (en) 2021-08-19

Family

ID=71168824

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098846 WO2021159656A1 (en) 2020-02-12 2020-06-29 Method, device, and equipment for semantic completion in a multi-round dialogue, and storage medium

Country Status (2)

Country Link
CN (1) CN111325034A (en)
WO (1) WO2021159656A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325034A (en) * 2020-02-12 2020-06-23 平安科技(深圳)有限公司 Method, device, equipment and storage medium for semantic completion in multi-round conversation
CN111858854B (en) * 2020-07-20 2024-03-19 上海汽车集团股份有限公司 Question-answer matching method and relevant device based on historical dialogue information
CN111858894A (en) * 2020-07-29 2020-10-30 网易(杭州)网络有限公司 Semantic missing recognition method and device, electronic equipment and storage medium
CN111966807A (en) * 2020-08-18 2020-11-20 中国银行股份有限公司 Text processing method and device of question-answering system
CN112183060B (en) * 2020-09-28 2022-05-10 重庆工商大学 Reference resolution method of multi-round dialogue system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107402913A (en) * 2016-05-20 2017-11-28 腾讯科技(深圳)有限公司 The determination method and apparatus of antecedent
CN109697282A (en) * 2017-10-20 2019-04-30 阿里巴巴集团控股有限公司 A kind of the user's intension recognizing method and device of sentence
CN109918494A (en) * 2019-03-22 2019-06-21 深圳狗尾草智能科技有限公司 Context relation based on figure replys generation method, computer and medium
US20190243900A1 (en) * 2017-03-03 2019-08-08 Tencent Technology (Shenzhen) Company Limited Automatic questioning and answering processing method and automatic questioning and answering system
CN111325034A (en) * 2020-02-12 2020-06-23 平安科技(深圳)有限公司 Method, device, equipment and storage medium for semantic completion in multi-round conversation

Also Published As

Publication number Publication date
CN111325034A (en) 2020-06-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20918221; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20918221; Country of ref document: EP; Kind code of ref document: A1)