CN112650846A - Question-answer intention knowledge base construction system and method based on question frame - Google Patents

Question-answer intention knowledge base construction system and method based on question frame

Info

Publication number
CN112650846A
Authority
CN
China
Prior art keywords
question
sentence
frame
answer
frame element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110040888.1A
Other languages
Chinese (zh)
Other versions
CN112650846B (en)
Inventor
侯志强
柳晶晶
刘锋
谭培波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhitong Yunlian Technology Co Ltd
Original Assignee
Beijing Zhitong Yunlian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhitong Yunlian Technology Co Ltd filed Critical Beijing Zhitong Yunlian Technology Co Ltd
Priority to CN202110040888.1A priority Critical patent/CN112650846B/en
Publication of CN112650846A publication Critical patent/CN112650846A/en
Application granted granted Critical
Publication of CN112650846B publication Critical patent/CN112650846B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a question-answer intention knowledge base construction system and method based on a question sentence frame. The system comprises: a data layer, which comprises a question corpus, a frame element dictionary and a question-answer intention knowledge base, and is used for storing, reading, writing and modifying files; a processing layer, which comprises a frame element processing module and a question-answer sentence rewriting module, and is used for rewriting question sentences; and an application layer, which comprises a question analysis module, and is used for outputting candidate target word strings formed by rewriting the question. The system and method address the problems in the prior art that frame elements are difficult to identify and that the answer sentence form cannot be obtained automatically from question analysis.

Description

Question-answer intention knowledge base construction system and method based on question frame
Technical Field
The invention relates to the technical field of constructing a question-answer intention knowledge base, in particular to a question-answer intention knowledge base construction system and method based on a question sentence frame.
Background
Sentence intention (the sentence frame) is the meaning of a sentence in the real physical world, i.e. its semantics. There are many approaches to semantics; generally the frame semantics (FrameNet) method is adopted, in which the name and the frame elements of a frame are determined according to the scene in which the sentence occurs, and the target word of the frame is defined according to the predicate or verb in the sentence. Defining the target word of a whole sentence by a predicate or verb, which is only one part of the sentence, and determining the frame elements from it has the following problems in practice:
(1) Entity ambiguity cannot be eliminated and frame elements cannot be identified
For example, in "how deep is the well of the capillary dam 3", "capillary dam 3" and "capillary dam 3 well" are two real entities of completely different types. Is the entity in the question "capillary dam 3", followed by "well depth", or "capillary dam 3 well", followed by "depth"? This ambiguity cannot be resolved at the word level; the actual intention and elements of the questioner can only be determined at the higher sentence level, with the help of the knowledge base.
(2) The frame of a sentence without a verb cannot be identified
In English, a verb-centered language, identifying frames and frame elements through verbs works well. When a question contains no verb, however, no target word can be defined for recognizing its intention (frame), so the frame and frame elements of the question cannot be determined and the question and answer cannot be analyzed.
For example, the question "the depth of the well in the capillary dam 3" has a very clear meaning in the question-answering scene: it asks for the "well depth" of each well included in the gas gathering station "capillary dam 3". The question contains only nouns and no verb, however, so NLP cannot recognize the frame and frame elements of the sentence.
(3) The frame cannot be identified without a function-word sequence
The slot method removes the entities from the sentence and performs frame recognition using the remaining function-word sequence as the target word. Because only half of the information in the sentence is used, the frame and the frame elements of the sentence may not be recognized.
For a sentence that contains function words, such as "how much is the well depth of the capillary dam 3", removing the content words "capillary dam 3" and "well depth" and reserving their slots leaves a function-word sequence that serves as the target word, from which the frame and the frame elements of the sentence can be identified. When no function-word sequence remains, this approach has no target word to work with.
(4) Question analysis cannot automatically obtain the answer sentence form
Questions and answers appear in pairs. Although a question and its answer differ, their wording, tone and semantics should remain consistent. If only the question is analyzed without considering the answer, a fluent answer form that fits the scene and the semantics cannot be obtained.
Disclosure of Invention
The invention aims to provide a question-answer intention knowledge base construction method based on a question frame, which can solve the problems that in the prior art, frame elements are difficult to identify and answer sentence forms of question analysis cannot be automatically obtained.
In order to achieve the above purpose, the invention provides the following technical scheme:
A question-answer intention knowledge base construction system based on a question frame comprises: a data layer, which comprises a question corpus, a frame element dictionary and a question-answer intention knowledge base, and is used for storing, reading, writing and modifying files;
a processing layer, which comprises a frame element processing module and a question-answer sentence rewriting module, and is used for rewriting sentences;
and an application layer, which comprises a question analysis module, and is used for outputting candidate target word strings formed by the rewritten sentences.
On the basis of the technical scheme, the invention can be further improved as follows:
furthermore, the question corpus comprises sequence numbers, question sources and questions, and is used for recording relevant information of the questions.
Further, the format of the frame element dictionary comprises a frame name and a frame element code; the frame element dictionary comprises a question parsing layer, the question parsing layer comprises a first layer and a second layer, the first layer is used for sequence parsing, and the second layer is used for implication relation and hierarchical structure parsing.
Further, the question-answer intention knowledge base comprises question target word strings and question-answer intention analysis, the question-answer intention analysis comprises a first part and a second part, the first part is the name of the frame, and the second part is an answer sentence template.
Further, the frame element processing module is used for finding word strings from the frame element dictionary.
Further, the rewriting module is configured to perform character replacement on word strings in the sentence to complete rewriting of the sentence, and each rewritten sentence is added to the rewritten sentence set as a new original sentence and accumulated until all the frame element character strings are used, so as to obtain a rewritten sentence set.
Further, the question analysis module is used for establishing a read-in question list and outputting a frame target word string formed by the rewritten sentences in a reverse sequencing mode according to the length of the word string.
A question-answer intention knowledge base construction method based on a question frame specifically comprises the following steps:
S101, constructing a sentence frame element dictionary according to the frame element dictionary and the question document;
S102, iterating over the sentence frame element dictionary;
S103, iterating over the existing sentence target word string set to form a new candidate target word string set, and keeping the sentence in the new candidate target word string set;
S104, replacing the corresponding words in the target word string according to the sentence frame element dictionary, and updating the candidate target word string set;
and S105, sorting the candidate target word strings by length and outputting them.
Further, constructing the sentence frame element dictionary specifically includes: searching each sentence in the question document, and whenever a word in the frame element dictionary appears in the sentence, collecting that word in the sentence frame element dictionary.
The invention has the following advantages:
The question-answer intention knowledge base construction system and method based on the question frame use the information of all the characters and word sequences in the question to the greatest extent and preserve the semantics of the question target word string to the greatest extent. They can effectively eliminate word-level entity ambiguity, and through the target word string of the question they also identify the frame name of the question, analyze its frame elements and generate answer sentences, thereby handling the analysis of question-answer intention in a question-answering system. This solves the problems in the prior art that frame elements are difficult to identify and that the answer form cannot be obtained automatically from question analysis.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of a system for constructing a knowledge base of question and answer intentions according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for constructing a knowledge base of question and answer intentions according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the encoding rules of question-answering frame names and the definition of frame elements according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a format of a question corpus according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a frame element dictionary format according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a knowledge base format of a question-answer chart in an embodiment of the present invention.
Description of reference numerals:
the system comprises a data layer 10, a question corpus 101, a frame element dictionary 102, a question-answer intention knowledge base 103, a processing layer 20, a frame element processing module 201, a question-answer sentence rewriting module 202, an application layer 30 and a question analysis module 301.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the invention provides a question-answer intention knowledge base construction system based on a question sentence framework, which comprises:
a data layer 10, which comprises a question corpus 101, a frame element dictionary 102 and a question-answer intention knowledge base 103, and is used for storing, reading, writing and modifying files;
a processing layer 20, which comprises a frame element processing module 201 and a question-answer sentence rewriting module 202, and is used for rewriting sentences;
and an application layer 30, which comprises a question analysis module 301, and is used for outputting candidate target word strings formed by the rewritten sentences.
Sentence intention (the sentence frame) is the meaning of a sentence in the real physical world, i.e. its semantics. There are many approaches to semantics; generally the frame semantics (FrameNet) method is adopted, in which the name and the frame elements of a frame are determined according to the scene in which the sentence occurs, and the target word of the frame is defined according to the predicate or verb in the sentence.
For example, a frame named "talk and talk" is defined in a social scenario. It generally has frame elements such as "sender", "receiver", "information" and "media", and the verbs contained in sentences, such as "say", "speak", "talk", "command", "tell", "discuss", "remind", "ask", "promise", "warn" and "threaten", are all target words of this frame.
For a specific sentence, the frame to which the sentence belongs is determined by recognizing the target word. For example, in "Zhang San tells Li Si where the airport is", the target word "tell" indicates that the sentence belongs to the "talk and talk" frame, and "Zhang San", "Li Si" and "where the airport is" are all frame elements of the frame.
In the invention, the entity words in the question are replaced by frame elements to form a complete word string that combines function words with content-word slots, so that the information of all the words in the sentence is used to the greatest extent to construct the most complete sentence frame target word string.
Specifically, the frame of the whole question-answer pair (including the question and the answer) and the question-answer frame element codes are first determined according to the question (for example, frame number F111 and frame element codes T for time, O for object and P for parameter).
Secondly, the frame element dictionary 102 is constructed according to the frame and the frame element codes of the sentence. The corresponding entity words in the original question and answer are then replaced by the frame element codes, so that the question and the answer are rewritten (for example, the question "how much is the gas production rate of the T1001 well on 1/3 in 2020" is rewritten into "how much is the TOP", and the answer into "F111# TOP is Q").
The rewritten question is defined as the target word under the selected frame and is used for identifying the frame; the rewritten answer serves as the answer sentence template and, together with the question, forms the question-answer frame corresponding to the sentence. The question-answer frames of all questions together form the whole question-answer knowledge base.
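The rewriting step can be illustrated with a minimal sketch. It assumes a tiny, invented frame element dictionary that maps entity words to the codes T, O and P named above; the function name `rewrite` and the dictionary entries are hypothetical and do not come from the patent. In this English rendering the codes end up spread through the sentence, rather than adjacent as in the "TOP" example.

```python
# Hypothetical frame element dictionary: entity word -> frame element code.
# The codes T (time), O (object), P (parameter) follow the text; the entries are invented.
FRAME_ELEMENTS = {
    "1/3 in 2020": "T",
    "T1001 well": "O",
    "gas production rate": "P",
}


def rewrite(sentence: str, elements: dict[str, str]) -> str:
    """Replace each entity word found in the sentence by its frame element code."""
    # Longest words first, so that a short entry never breaks up a longer one.
    for word in sorted(elements, key=len, reverse=True):
        sentence = sentence.replace(word, elements[word])
    return sentence


question = "how much is the gas production rate of T1001 well on 1/3 in 2020"
rewritten = rewrite(question, FRAME_ELEMENTS)
print(rewritten)
```

The rewritten string keeps every function word of the original question and marks each entity by its code, which is what lets it serve as the target word string of the frame.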
Further, as shown in fig. 4, the question corpus 101 includes sequence numbers, question sources, and questions, and is used to record relevant information of the questions.
This information can be extended with fields such as the region and the questioner, in preparation for more accurate question answering in the future.
Further, as shown in fig. 5, the format of the frame element dictionary 102 includes the name of the frame and the frame element code. Considering the granularity of time in Chinese natural language and inclusion relations in natural language such as "how much oil is produced", the frame element dictionary 102 includes question parsing layers: a first layer for sequence parsing and a second layer for inclusion relation and hierarchical structure parsing. The question-answer frame name definition and the frame element definition are shown in fig. 3, wherein the question is defined with the 5 elements TOPVM and the answer with the 6 elements TOPVMQ.
Further, as shown in the knowledge base format of fig. 6, the question-answer intention knowledge base 103 includes a question target word string and a question-answer intention analysis. The question-answer intention analysis includes a first part and a second part separated by "@ @ @": the first part is the name of the frame, and the second part is the answer sentence template.
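As an illustration of this two-part format, the following sketch splits an intention analysis string into a frame name and an answer sentence template. The separator is written here as "@@@" without spaces, which is an assumption (the text prints it as "@ @ @"); the names `Intention`, `parse_intention` and the sample entry are hypothetical.

```python
from typing import NamedTuple

# Assumption: the separator has no inner spaces; the text prints it as "@ @ @".
SEPARATOR = "@@@"


class Intention(NamedTuple):
    frame_name: str        # first part: name of the frame
    answer_template: str   # second part: answer sentence template


def parse_intention(analysis: str) -> Intention:
    """Split a question-answer intention analysis into its two parts."""
    frame_name, answer_template = analysis.split(SEPARATOR, 1)
    return Intention(frame_name.strip(), answer_template.strip())


# Hypothetical knowledge base: question target word string -> intention analysis.
knowledge_base = {"how much is the TOP": "F111@@@TOP is Q"}

entry = parse_intention(knowledge_base["how much is the TOP"])
print(entry.frame_name, "|", entry.answer_template)
```

Looking up a rewritten question in such a mapping returns both the frame to which it belongs and the template from which the answer sentence can be filled in.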
Further, the frame element processing module 201 is configured to find word strings from the frame element dictionary 102.
Further, the question-answer sentence rewriting module 202 is configured to perform character replacement on word strings in the sentence to complete the rewriting of the sentence. Each rewritten sentence is added to the rewritten sentence set as a new original sentence, and the set accumulates until all the frame element character strings have been used, yielding the rewritten sentence set.
The target word string set obtained by rewriting the original question contains not only the sentence pattern constructed from the function-word part of the original sentence, but also the frame or slot positions constructed from the content-word part. The method therefore uses all the information in the sentence and is the most complete way of constructing the frame target word of the sentence.
Further, the question analysis module 301 is configured to establish a list of read-in questions and to output the frame target word strings formed by rewriting each question, sorted in reverse order of word string length, so that the proper frame target word string can be selected and checked. For a simple question such as "what is the gas production rate of 401-1 well", because the dictionary is large, generally on the order of 1,000,000 entries, 163 possible frame target word strings are output, of which only "what is the OP" is the correct question frame target word string. The more complicated the question, the more frame elements it includes and the larger the number of frame target word strings output.
A question-answer intention knowledge base construction method based on a question frame specifically comprises the following steps:
S101, constructing a sentence frame element dictionary;
in this step, a sentence frame element dictionary is constructed according to the frame element dictionary 102 and the question document. The frame element dictionary 102 is opened and the whole dictionary is read into memory to increase the processing speed. The input question file is opened and read sentence by sentence, and each sentence is processed in turn. The frame elements corresponding to each question differ, so the sentence frame element dictionary must be built for each sentence independently.
Whenever a word in the frame element dictionary 102 appears in the sentence, it is collected in the sentence frame element dictionary. The sentence frame element dictionary may contain many entries with complicated inclusion relationships; for example, "1" may be month data or the well number of an entity, and every such reading needs to be included in the sentence frame element dictionary.
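A minimal sketch of step S101 follows, assuming the frame element dictionary is held in memory as a mapping from a word to the list of codes it may carry. The function name `sentence_elements` and the sample entries are hypothetical; plain substring matching is used here as the simplest stand-in for the lookup.

```python
def sentence_elements(sentence: str, frame_dict: dict[str, list[str]]) -> dict[str, list[str]]:
    """Collect every dictionary word that occurs in the sentence, with all of its codes."""
    return {word: codes for word, codes in frame_dict.items() if word in sentence}


# Invented sample of a frame element dictionary already read into memory.
frame_dict = {
    "1": ["T", "O"],               # "1" may be month data or a well number
    "401-1 well": ["O"],
    "gas production rate": ["P"],
    "oil production": ["P"],       # does not occur in the sentence below
}

found = sentence_elements("what is the gas production rate of 401-1 well", frame_dict)
for word, codes in found.items():
    print(word, codes)
```

Note that "1" is collected even though it only occurs inside "401-1 well": overlapping and nested readings are all kept at this stage, and the later steps enumerate the candidates they give rise to.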
S102, iterating over the sentence frame element dictionary;
in this step, a loop runs over the whole sentence frame element dictionary. Every entry in the dictionary is tried exhaustively, checking each possibility of constructing a candidate target word string by rewriting the sentence.
S103, iterating over the existing sentence target word string set;
in this step, a loop runs over the existing sentence target word string set to form a new candidate target word string set, which updates the target word string set for a single dictionary element. The existing sentence target word string set is a variable, growing set, because the number of candidate target word strings increases every time a word in the sentence is replaced by an element of the dictionary.
Here, all the target word strings are updated once. The sentence is retained in the new candidate target word string set: the original sentence always stays in the set, so the candidate set only expands. Adding to a new set also removes duplicate sentences automatically.
S104, updating the candidate target word string set;
in this step, the corresponding words in the target word string are replaced according to the sentence frame element dictionary, and each sentence obtained by a replacement is also added to the target word string set. It should be noted that if the same element word occurs several times in a sentence, only one occurrence can be replaced each time; the remaining occurrences are left to be replaced by subsequent element words.
The next element word is then processed by returning to step S102. The key point is that the target word string set is an updated new set, which accumulates all the non-repeated candidate target word strings produced from the original sentence through each dictionary entry processed so far.
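Steps S102 to S104 can be sketched as two nested loops over a growing set. This is an illustrative reading of the steps, not the patented implementation: the sentence frame element dictionary is assumed to be a list of (word, code) pairs, and the function name `candidate_target_strings` is hypothetical.

```python
def candidate_target_strings(sentence: str, elements: list[tuple[str, str]]) -> set[str]:
    """Enumerate the candidate target word strings obtainable from one sentence."""
    candidates = {sentence}                    # the original sentence is always kept
    for word, code in elements:                # S102: loop over the sentence frame elements
        new_candidates = set(candidates)       # S103: every existing string is retained
        for s in candidates:                   # S103: loop over the existing set
            if word in s:
                # S104: replace a single occurrence and add the result to the set.
                new_candidates.add(s.replace(word, code, 1))
        candidates = new_candidates            # a set drops duplicate strings automatically
    return candidates


elements = [("401-1 well", "O"), ("gas production rate", "P")]
result = candidate_target_strings("what is the gas production rate of 401-1 well", elements)
for s in result:
    print(s)
```

With two independent element words the set doubles at each pass and ends with four strings; with a large dictionary and overlapping entries the same procedure yields many more candidates, which is why the output is sorted and checked manually afterwards.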
S105, outputting the candidate target word strings;
in this step, the candidate target word strings are sorted by length and output.
Further, constructing the sentence frame element dictionary specifically includes: each sentence in the question document is searched, and whenever a word in the frame element dictionary 102 appears in the sentence, the word is collected to form the sentence frame element dictionary.
The candidate target word strings are sorted by length from shortest to longest. In general, a longer word with more characters has more definite semantics, so after replacement, the shorter the whole target word string, the more likely it is to be the correct one. This ordering reduces the time a person needs to find the correct target word string. All candidate target word strings are then output to a file: a target word string sequence is constructed for each question and recorded in the output file for manual inspection.
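The shortest-first ordering described above can be sketched as follows. The function name `write_candidates` is hypothetical, the alphabetical tie-break is an assumption added only to make the output reproducible, and standard output stands in for the output file.

```python
import sys
from typing import Iterable, TextIO


def write_candidates(candidates: Iterable[str], out: TextIO) -> list[str]:
    """Sort candidates from shortest to longest and write one per line."""
    # Assumption: ties in length are broken alphabetically for a stable order.
    ordered = sorted(set(candidates), key=lambda s: (len(s), s))
    for s in ordered:
        out.write(s + "\n")
    return ordered


candidates = {
    "what is the gas production rate of 401-1 well",
    "what is the gas production rate of O",
    "what is the P of 401-1 well",
    "what is the P of O",
}
ordered = write_candidates(candidates, sys.stdout)
```

The fully rewritten string comes first and the untouched original sentence last, so a reviewer sees the most likely target word string at the top of the list.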
The question-answer intention knowledge base 103 based on the question frame is constructed by the following steps:
when the method is used, an operator constructs a sentence frame element dictionary 102 according to the frame element dictionary 102 and a question document; looping the sentence frame element dictionary 102; circulating the existing sentence target word string set to form a new candidate target word string set, and keeping the sentence in the new candidate target word string set; replacing corresponding words in the target word string by using the sentence frame element dictionary 102, and updating the candidate target word string set; and sorting according to the length of the candidate target word string, and outputting the candidate target word string.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include more than one of the feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise. Furthermore, the terms "mounted" and "connected" are to be construed broadly: a connection may, for example, be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, indirect through intervening media, or an internal connection between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A question-answer intention knowledge base construction system based on a question frame, characterized by comprising:
a data layer, which comprises a question corpus, a frame element dictionary and a question-answer intention knowledge base, and is used for storing, reading, writing and modifying files;
a processing layer, which comprises a frame element processing module and a question-answer sentence rewriting module, and is used for rewriting sentences;
and an application layer, which comprises a question analysis module, and is used for outputting candidate target word strings formed by the rewritten sentences.
2. The question-answer intention knowledge base construction system based on a question frame as claimed in claim 1, wherein the question corpus comprises sequence numbers, question sources and questions and is used for recording relevant information of the questions.
3. The question-answer intention knowledge base construction system based on a question sentence frame according to claim 2, wherein the format of the frame element dictionary comprises a frame name and a frame element code; the frame element dictionary comprises a question parsing layer, the question parsing layer comprises a first layer and a second layer, the first layer is used for sequence parsing, and the second layer is used for implication relation and hierarchical structure parsing.
4. The question-answer intention knowledge base construction system based on a question frame according to claim 3, wherein the question-answer intention knowledge base comprises question target word strings and question-answer intention analysis, the question-answer intention analysis comprises a first part and a second part, the first part is a name of the frame, and the second part is an answer sentence template.
5. The question-answer intention knowledge base construction system based on a question frame as claimed in claim 1, wherein the frame element processing module is used for finding word strings from the frame element dictionary.
6. The question-answer intention knowledge base construction system based on a question frame according to claim 5, wherein the question-answer sentence rewriting module is configured to rewrite a sentence by replacing the characters of word strings in the sentence, and each rewritten sentence is added as a new original sentence to the rewritten sentence set, which accumulates until all frame element character strings have been used, so as to obtain the final rewritten sentence set.
7. The question-answer intention knowledge base construction system based on a question frame as claimed in claim 6, wherein the question analysis module is configured to establish a list of the questions read in, and to output the frame target word strings formed from the rewritten sentences sorted in reverse order of word string length.
8. A question-answer intention knowledge base construction method based on a question frame, characterized by specifically comprising the following steps:
S101, constructing a sentence frame element dictionary from the frame element dictionary and the question document;
S102, iterating over the sentence frame element dictionary;
S103, iterating over the existing set of sentence target word strings to form a new candidate target word string set, the sentence itself being retained in the new candidate target word string set;
S104, replacing the corresponding words in each target word string according to the sentence frame element dictionary, and updating the candidate target word string set;
and S105, sorting the candidate target word strings by length and outputting them.
9. The question-answer intention knowledge base construction method based on a question frame as claimed in claim 8, wherein constructing the sentence frame element dictionary specifically comprises: searching each sentence in the question document and, when words in the frame element dictionary appear in the sentence, collecting the corresponding entries of the frame element dictionary to form the sentence frame element dictionary.
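
Steps S101 to S105 of claim 8, together with the dictionary construction of claim 9, can be illustrated by the following minimal Python sketch. It is not the patented implementation. The function names, the representation of the frame element dictionary as a mapping from word string to frame element code, the sample entries, and the choice to measure length in characters and sort longest first (following the "reverse sorting" wording of claim 7) are all assumptions made for illustration.

```python
from typing import Dict, List


def build_sentence_frame_dict(sentence: str, frame_dict: Dict[str, str]) -> Dict[str, str]:
    """S101 / claim 9 (assumed reading): keep only the dictionary entries
    whose word string occurs in the sentence."""
    return {word: code for word, code in frame_dict.items() if word in sentence}


def candidate_target_strings(sentence: str, frame_dict: Dict[str, str]) -> List[str]:
    """Return every rewrite of the sentence obtained by replacing any subset
    of its frame element words with their frame element codes."""
    sentence_dict = build_sentence_frame_dict(sentence, frame_dict)
    candidates = [sentence]  # the original sentence is always retained
    for word, code in sentence_dict.items():      # S102: loop over frame elements
        new_candidates = list(candidates)         # S103: keep existing strings
        for cand in candidates:
            rewritten = cand.replace(word, code)  # S104: substitute word by code
            if rewritten not in new_candidates:
                new_candidates.append(rewritten)
        candidates = new_candidates
    # S105: sort by character length, longest first; ties broken alphabetically
    return sorted(candidates, key=lambda s: (-len(s), s))


if __name__ == "__main__":
    # Hypothetical dictionary entries, not taken from the patent.
    demo_dict = {"Beijing": "<city>", "tomorrow": "<date>", "Paris": "<city>"}
    for line in candidate_target_strings("weather in Beijing tomorrow", demo_dict):
        print(line)
```

With n frame element words found in a sentence, the loop produces up to 2^n candidate strings, because each existing candidate is kept and also rewritten once per frame element. The sketch assumes that no frame element code itself contains a dictionary word.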
CN202110040888.1A 2021-01-13 2021-01-13 Question and answer intention knowledge base construction system and method based on question frame Active CN112650846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110040888.1A CN112650846B (en) 2021-01-13 2021-01-13 Question and answer intention knowledge base construction system and method based on question frame

Publications (2)

Publication Number Publication Date
CN112650846A true CN112650846A (en) 2021-04-13
CN112650846B CN112650846B (en) 2024-08-23

Family

ID=75367975

Country Status (1)

Country Link
CN (1) CN112650846B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant