CN114238715A - Question-answering system based on social aid, construction method, computer equipment and medium - Google Patents

Publication number
CN114238715A
Authority
CN
China
Prior art keywords
social
word segmentation
data
triple
key information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111465276.3A
Other languages
Chinese (zh)
Inventor
莫东序
夏状
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Sida Digital Technology Co ltd
Original Assignee
Guangxi Sida Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Sida Digital Technology Co ltd
Priority to CN202111465276.3A
Publication of CN114238715A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The scheme relates to a question-answering system based on social aid. The system comprises: a rule determining module for acquiring social aid regulation data and social aid question data and determining an information extraction rule and a word segmentation rule; a key information extraction module for extracting reply key information from the social aid regulation data according to the information extraction rule and extracting question key information from the social aid question data according to the word segmentation rule; a knowledge graph generating module for constructing reply triples and a knowledge graph according to the reply key information; a question triple construction module for obtaining a question triple according to the question key information and converting the question triple into a query statement; and a natural semantics obtaining module for searching the knowledge graph for the target reply triple corresponding to the query statement and optimizing the target reply triple to obtain the corresponding natural semantics. Because reply triples, a knowledge graph and question triples are constructed, the pressure of manual question answering is reduced and users can obtain accurate replies in time.

Description

Question-answering system based on social aid, construction method, computer equipment and medium
Technical Field
The invention relates to the technical field of intelligent question answering, in particular to a question answering system based on social aid, a construction method, computer equipment and a storage medium.
Background
Social aid refers to a system in which the state and society provide material relief and living support to citizens who have fallen into hardship for various reasons, so as to guarantee their minimum living needs. When citizens seek social aid, they usually have many questions that staff must answer. At present, policy introduction and consultation for social aid still rely on manual question answering, which places a heavy and highly repetitive workload on staff.
Therefore, the traditional social aid mode has two problems: users' questions cannot be answered in time, and the pressure of manual question answering is high.
Disclosure of Invention
Based on this, to solve the above technical problems, a question-answering system based on social aid, a construction method, a computer device and a storage medium are provided, which help users obtain timely replies about social aid policies and reduce the pressure of manual question answering in social aid work.
A social aid based question-answering system, the system comprising:
a rule determining module, configured to acquire social aid regulation data and social aid question data, determine an information extraction rule according to the social aid regulation data, and determine a word segmentation rule according to the social aid question data;
a key information extraction module, configured to perform data processing on the social aid regulation data to obtain social aid text content, extract reply key information from the social aid text content according to the information extraction rule, and extract question key information from the social aid question data according to the word segmentation rule;
a knowledge graph generating module, configured to acquire predefined triple types and contents, and use them to construct reply triples from the reply key information so as to generate a knowledge graph;
a question triple construction module, configured to perform dependency syntax analysis on the question key information to obtain a question triple, and convert the question triple into a query statement;
and a natural semantics obtaining module, configured to search the knowledge graph for the target reply triple corresponding to the query statement, and optimize the target reply triple to obtain the natural semantics corresponding to the query statement.
In one embodiment, the key information extraction module is further configured to: automatically capture text data of the social aid regulation data using the Scrapy web crawler framework; process the text data into a picture or PDF format file using OCR text recognition technology; and extract the social aid text content from the picture or PDF format file using the Faster R-CNN algorithm.
In one embodiment, the key information extraction module is further configured to: determine the maximum word length in the social aid question data; segment the sentence with a forward maximum matching algorithm according to the maximum length to obtain a first word segmentation result; segment the sentence with a reverse maximum matching algorithm according to the maximum length to obtain a second word segmentation result; and extract the question key information from the social aid question data according to the first and second word segmentation results, and classify the social aid question data.
In one embodiment, the key information extraction module is further configured to: match candidate words in the sentence from left to right with the forward maximum matching algorithm according to the maximum length, and obtain the first word segmentation result if the match succeeds; if the match fails, reduce the length of the candidate word until the match succeeds, so as to obtain the first word segmentation result; match candidate words in the sentence from right to left with the reverse maximum matching algorithm according to the maximum length, and obtain the second word segmentation result if the match succeeds; and if the match fails, reduce the length of the candidate word until the match succeeds, so as to obtain the second word segmentation result.
In one embodiment, the key information extraction module is further configured to: compare the first word segmentation result with the second word segmentation result, and when the two are the same, return either one as the question key information.
In one embodiment, the question triple construction module is further configured to: convert the question key information into corresponding character vectors and extract feature information of the character vectors; obtain a globally optimal sequence according to the feature information, and perform named entity recognition on each character vector according to the globally optimal sequence; and perform dependency syntax analysis on the character vectors after named entity recognition to obtain a question triple.
A method for constructing a question-answering system based on social aid, the method comprising:
acquiring social aid regulation data and social aid question data, determining an information extraction rule according to the social aid regulation data, and determining a word segmentation rule according to the social aid question data;
performing data processing on the social aid regulation data to obtain social aid text content, extracting reply key information from the social aid text content according to the information extraction rule, and extracting question key information from the social aid question data according to the word segmentation rule;
acquiring predefined triple types and contents, and using them to construct reply triples from the reply key information so as to generate a knowledge graph;
performing dependency syntax analysis on the question key information to obtain a question triple, and converting the question triple into a query statement;
and searching the knowledge graph for the target reply triple corresponding to the query statement, and optimizing the target reply triple to obtain the natural semantics corresponding to the query statement.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring social aid regulation data and social aid question data, determining an information extraction rule according to the social aid regulation data, and determining a word segmentation rule according to the social aid question data;
performing data processing on the social aid regulation data to obtain social aid text content, extracting reply key information from the social aid text content according to the information extraction rule, and extracting question key information from the social aid question data according to the word segmentation rule;
acquiring predefined triple types and contents, and using them to construct reply triples from the reply key information so as to generate a knowledge graph;
performing dependency syntax analysis on the question key information to obtain a question triple, and converting the question triple into a query statement;
and searching the knowledge graph for the target reply triple corresponding to the query statement, and optimizing the target reply triple to obtain the natural semantics corresponding to the query statement.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring social aid regulation data and social aid question data, determining an information extraction rule according to the social aid regulation data, and determining a word segmentation rule according to the social aid question data;
performing data processing on the social aid regulation data to obtain social aid text content, extracting reply key information from the social aid text content according to the information extraction rule, and extracting question key information from the social aid question data according to the word segmentation rule;
acquiring predefined triple types and contents, and using them to construct reply triples from the reply key information so as to generate a knowledge graph;
performing dependency syntax analysis on the question key information to obtain a question triple, and converting the question triple into a query statement;
and searching the knowledge graph for the target reply triple corresponding to the query statement, and optimizing the target reply triple to obtain the natural semantics corresponding to the query statement.
According to the above question-answering system based on social aid, construction method, computer device and storage medium, social aid regulation data and social aid question data are acquired, an information extraction rule is determined according to the social aid regulation data, and a word segmentation rule is determined according to the social aid question data; data processing is performed on the social aid regulation data to obtain social aid text content, reply key information is extracted from the social aid text content according to the information extraction rule, and question key information is extracted from the social aid question data according to the word segmentation rule; predefined triple types and contents are acquired and used to construct reply triples from the reply key information so as to generate a knowledge graph; dependency syntax analysis is performed on the question key information to obtain a question triple, and the question triple is converted into a query statement; and the knowledge graph is searched for the target reply triple corresponding to the query statement, and the target reply triple is optimized to obtain the natural semantics corresponding to the query statement. Because the knowledge graph is generated from reply triples and questions are converted into question triples, the pressure of manual question answering is reduced by automated means, and users can conveniently obtain accurate replies in time.
Drawings
FIG. 1 is a diagram of the application environment of a method for constructing a question-answering system based on social aid in one embodiment;
FIG. 2 is a block diagram of a question-answering system based on social aid in one embodiment;
FIG. 3 is a schematic flow chart of a method for constructing a question-answering system based on social aid in one embodiment;
FIG. 4 is a diagram of the internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The method for constructing a question-answering system based on social aid provided by the embodiments of the application can be applied to the application environment shown in FIG. 1. As shown in FIG. 1, the application environment includes a computer device 110. The computer device 110 may acquire social aid regulation data and social aid question data, determine an information extraction rule according to the social aid regulation data, and determine a word segmentation rule according to the social aid question data; the computer device 110 may perform data processing on the social aid regulation data to obtain social aid text content, extract reply key information from the social aid text content according to the information extraction rule, and extract question key information from the social aid question data according to the word segmentation rule; the computer device 110 may acquire predefined triple types and contents, and use them to construct reply triples from the reply key information so as to generate a knowledge graph; the computer device 110 may perform dependency syntax analysis on the question key information to obtain a question triple, and convert the question triple into a query statement; the computer device 110 may search the knowledge graph for the target reply triple corresponding to the query statement, and optimize the target reply triple to obtain the natural semantics corresponding to the query statement. The computer device 110 may be, but is not limited to, a personal computer, notebook computer, smart phone, robot, tablet computer or similar device.
In one embodiment, as shown in FIG. 2, a question-answering system based on social aid is provided, comprising: a rule determining module 210, a key information extraction module 220, a knowledge graph generating module 230, a question triple construction module 240 and a natural semantics obtaining module 250, wherein:
The rule determining module 210 is configured to acquire social aid regulation data and social aid question data, determine an information extraction rule according to the social aid regulation data, and determine a word segmentation rule according to the social aid question data.
The social aid regulation data may be policies and regulations related to social aid; the social aid question data may be frequently asked questions about social aid that have been collected in advance. The rule determining module 210 may acquire the social aid regulation data and social aid question data uploaded by staff, and define a rule for extracting key information, that is, the information extraction rule, according to the social aid regulation data; meanwhile, the rule determining module 210 may determine the word segmentation rule according to the social aid question data.
The key information extraction module 220 is configured to perform data processing on the social aid regulation data to obtain social aid text content, extract reply key information from the social aid text content according to the information extraction rule, and extract question key information from the social aid question data according to the word segmentation rule.
The key information extraction module 220 extracts both the reply key information and the question key information. Specifically, it may perform text recognition on the social aid regulation data to obtain the social aid text content; since the social aid regulation data may be policies and regulations related to social aid, the module may then extract the reply key information from the social aid text content according to the predefined information extraction rule. Similarly, since the social aid question data may be frequently asked questions about social aid, the key information extraction module 220 extracts the question key information from the social aid question data according to the word segmentation rule.
The knowledge graph generating module 230 is configured to acquire predefined triple types and contents, and use them to construct reply triples from the reply key information so as to generate a knowledge graph.
The triple types and contents may be defined in advance and stored in the computer device. The knowledge graph generating module 230 may acquire the predefined triple types and contents, use them to construct reply triples from the extracted reply key information, convert the reply triples into Cypher statements, and store them in a Neo4j database to complete the construction of the social aid knowledge graph.
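The conversion of reply triples into Cypher statements can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Triple` fields, the `Entity` label, the `REL` relationship type and the sample triple are all assumptions, since the source does not disclose its graph schema. The sketch only builds statement strings; executing them against Neo4j is left out.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A (head, relation, tail) reply triple; field names are illustrative."""
    head: str
    relation: str
    tail: str


def escape(value: str) -> str:
    """Escape backslashes and single quotes for a Cypher string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def triple_to_cypher(triple: Triple) -> str:
    """Build MERGE clauses that create both nodes and the edge at most once.

    The Entity label and REL relationship type are hypothetical schema choices.
    """
    return (
        f"MERGE (h:Entity {{name: '{escape(triple.head)}'}}) "
        f"MERGE (t:Entity {{name: '{escape(triple.tail)}'}}) "
        f"MERGE (h)-[:REL {{type: '{escape(triple.relation)}'}}]->(t)"
    )


# Hypothetical reply triple, invented for illustration only.
example = Triple(
    "minimum living allowance",
    "application_condition",
    "household income below local standard",
)
print(triple_to_cypher(example))
```

MERGE is used rather than CREATE so that loading the same regulation twice does not duplicate nodes. In a real deployment, passing values as query parameters would be safer than inlining escaped strings.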
The question triple construction module 240 is configured to perform dependency syntax analysis on the question key information to obtain a question triple, and convert the question triple into a query statement.
The question triple construction module 240 may segment the question key information into words and perform dependency syntax analysis on it to obtain a question triple, and may then convert the question triple into a corresponding Cypher query statement.
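One way to picture the step from dependency arcs to a Cypher query is the sketch below. It assumes the parser output is already available as `(dependent, head, label)` arcs and applies a single toy rule; the labels `SBV` (subject-verb) and `ATT` (attribute), the rule itself, and the `Entity`/`REL` schema are assumptions made for illustration and are not taken from the source.

```python
from typing import Dict, List, Optional, Tuple

# Each arc is (dependent_word, head_word, relation_label).
Arc = Tuple[str, str, str]
QuestionTriple = Tuple[str, str, Optional[str]]


def arcs_to_question_triple(arcs: List[Arc]) -> Optional[QuestionTriple]:
    """Toy rule: the ATT modifier of the subject is the entity, the subject
    is the relation, and the tail is the unknown being asked for (None)."""
    subject = next((dep for dep, _head, label in arcs if label == "SBV"), None)
    if subject is None:
        return None
    entity = next(
        (dep for dep, head, label in arcs if label == "ATT" and head == subject),
        None,
    )
    if entity is None:
        return None
    return (entity, subject, None)


def question_triple_to_cypher(triple: QuestionTriple) -> Tuple[str, Dict[str, str]]:
    """Return a parameterised Cypher query that looks up the missing tail."""
    head, relation, _unknown = triple
    query = (
        "MATCH (h:Entity {name: $head})-[r:REL {type: $relation}]->(t:Entity) "
        "RETURN t.name AS answer"
    )
    return query, {"head": head, "relation": relation}


# Hypothetical arcs for "What are the conditions of the allowance?"
arcs = [
    ("allowance", "conditions", "ATT"),
    ("conditions", "are", "SBV"),
    ("what", "are", "VOB"),
]
triple = arcs_to_question_triple(arcs)
print(triple)
print(question_triple_to_cypher(triple))
```

Leaving the tail as `None` makes explicit that a question triple is a reply triple with one slot missing, which is what the graph lookup fills in.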
The natural semantics obtaining module 250 is configured to search the knowledge graph for the target reply triple corresponding to the query statement, and optimize the target reply triple to obtain the natural semantics corresponding to the query statement.
After obtaining the query statement, the natural semantics obtaining module 250 may search the knowledge graph for the target reply triple that corresponds to the query statement. The module may then optimize the target reply triple and feed the result back to the user; the data fed back is the natural semantics, that is, a natural-language reply, corresponding to the query statement.
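The source does not say how the target reply triple is "optimized" into natural semantics. One plausible reading is template-based rendering, sketched below; the relation names, templates and sample values are hypothetical.

```python
# Hypothetical per-relation sentence templates; the source does not list any.
TEMPLATES = {
    "application_condition": "The application condition for {head} is: {tail}.",
    "handling_agency": "{head} is handled by {tail}.",
}

# Fallback used when no template is registered for a relation.
DEFAULT_TEMPLATE = "{head} - {relation}: {tail}."


def render_reply(head: str, relation: str, tail: str) -> str:
    """Turn a (head, relation, tail) reply triple into a readable sentence."""
    template = TEMPLATES.get(relation, DEFAULT_TEMPLATE)
    return template.format(head=head, relation=relation, tail=tail)


print(render_reply("Temporary assistance", "handling_agency", "the township government"))
```

A fallback template keeps the system answering even for relation types that nobody has written a sentence pattern for yet.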
In this embodiment, the rule determining module 210 acquires social aid regulation data and social aid question data, determines an information extraction rule according to the social aid regulation data, and determines a word segmentation rule according to the social aid question data; the key information extraction module 220 performs data processing on the social aid regulation data to obtain social aid text content, extracts reply key information from the social aid text content according to the information extraction rule, and extracts question key information from the social aid question data according to the word segmentation rule; the knowledge graph generating module 230 acquires predefined triple types and contents, and uses them to construct reply triples from the reply key information so as to generate a knowledge graph; the question triple construction module 240 performs dependency syntax analysis on the question key information to obtain a question triple, and converts the question triple into a query statement; and the natural semantics obtaining module 250 searches the knowledge graph for the target reply triple corresponding to the query statement, and optimizes the target reply triple to obtain the natural semantics corresponding to the query statement. Because the knowledge graph is generated from reply triples and questions are converted into question triples, the pressure of manual question answering is reduced by automated means, and users can conveniently obtain accurate replies in time.
In one embodiment, the key information extraction module 220 is further configured to: automatically capture text data of the social aid regulation data using the Scrapy web crawler framework; process the text data into a picture or PDF format file using OCR text recognition technology; and extract the social aid text content from the picture or PDF format file using the Faster R-CNN algorithm.
The key information extraction module 220 may create a Scrapy project and define the content to be crawled, run a pre-written Spider to capture the text data of the social aid regulation data, store the captured text data in MongoDB, and deploy the Scrapy project. Next, the key information extraction module 220 may process the text data into a picture or PDF format file using OCR text recognition technology, extract the social aid text content from the picture or PDF format file using the Faster R-CNN algorithm, and then extract the reply key information from the social aid text content using regular expressions.
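The final regular-expression step can be illustrated with a small sketch. The source does not give its expressions, so the pattern below is an assumption: it treats each "Article N" clause of a regulation as one unit of reply key information, and the sample text is invented.

```python
import re
from typing import Dict

# Hypothetical pattern: "Article <number>" followed by its body, which runs
# until the next "Article <number>" heading or the end of the text.
ARTICLE_PATTERN = re.compile(
    r"Article\s+(\d+)[.:]?\s+(.*?)(?=\s*Article\s+\d+|\Z)",
    re.DOTALL,
)


def extract_articles(text: str) -> Dict[int, str]:
    """Map article number to article body, with whitespace normalised."""
    return {
        int(number): " ".join(body.split())
        for number, body in ARTICLE_PATTERN.findall(text)
    }


# Invented sample text, for illustration only.
sample = (
    "Article 1 Families below the local standard may apply.\n"
    "Article 2 Applications are filed with the township government."
)
print(extract_articles(sample))
```

This simple pattern would split incorrectly if an article body itself cites another article by number, so real extraction rules would need to be stricter about where a heading may start.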
In one embodiment, the key information extraction module 220 is further configured to: determine the maximum word length in the social aid question data; segment the sentence with a forward maximum matching algorithm according to the maximum length to obtain a first word segmentation result; segment the sentence with a reverse maximum matching algorithm according to the maximum length to obtain a second word segmentation result; and extract the question key information from the social aid question data according to the first and second word segmentation results, and classify the social aid question data.
The key information extraction module 220 may use a bidirectional maximum matching algorithm to segment the frequently asked social aid questions, extract the question key information from the segmentation results, and classify the questions by category. Specifically, the key information extraction module 220 may determine the maximum word length in the frequently asked questions, then segment each sentence with the forward maximum matching algorithm and the reverse maximum matching algorithm to obtain the first and second word segmentation results respectively. The key information extraction module 220 may extract the question key information from the social aid question data according to the two results and classify the social aid question data, so that answers can be looked up quickly in the knowledge graph.
In one embodiment, the key information extraction module 220 is further configured to: match candidate words in the sentence from left to right with the forward maximum matching algorithm according to the maximum length, and obtain the first word segmentation result if the match succeeds; if the match fails, reduce the length of the candidate word until the match succeeds, so as to obtain the first word segmentation result; match candidate words in the sentence from right to left with the reverse maximum matching algorithm according to the maximum length, and obtain the second word segmentation result if the match succeeds; and if the match fails, reduce the length of the candidate word until the match succeeds, so as to obtain the second word segmentation result.
The key information extraction module 220 may scan the sentence from left to right with the forward maximum matching algorithm. If a candidate word of the maximum length matches, it is split off; if not, the candidate length is reduced by 1 and matching is retried. This repeats until the whole sentence has been segmented, which gives the first word segmentation result. Similarly, the key information extraction module 220 may scan the sentence from right to left with the reverse maximum matching algorithm, splitting off a candidate when it matches and otherwise reducing its length by 1, until the whole sentence has been segmented, which gives the second word segmentation result.
In one embodiment, the key information extraction module 220 is further configured to: compare the first word segmentation result with the second word segmentation result, and when the two are the same, return either one as the question key information.
The key information extraction module 220 may determine whether the first word segmentation result is the same as the second; if so, either result is returned as the question key information. If the two results contain different numbers of words, the result with fewer words is returned as the question key information.
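The bidirectional maximum matching procedure described above can be sketched in a few lines. The vocabulary, the sample sentences and two details are assumptions: the maximum length is taken from the longest vocabulary entry, and when the two results differ but contain the same number of words (a case the text does not cover) the reverse result is returned.

```python
from typing import List, Set


def forward_max_match(sentence: str, vocab: Set[str], max_len: int) -> List[str]:
    """Scan left to right, always taking the longest candidate in vocab."""
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # A single character is always accepted so the scan can advance.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(sentence: str, vocab: Set[str], max_len: int) -> List[str]:
    """Scan right to left, always taking the longest candidate in vocab."""
    tokens, j = [], len(sentence)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = sentence[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bidirectional_max_match(sentence: str, vocab: Set[str]) -> List[str]:
    """Return the shared result, or the one with fewer words if they differ."""
    max_len = max((len(word) for word in vocab), default=1)
    fwd = forward_max_match(sentence, vocab, max_len)
    bwd = backward_max_match(sentence, vocab, max_len)
    if fwd == bwd:
        return fwd
    # Assumption: on an equal word count the reverse result is preferred.
    return fwd if len(fwd) < len(bwd) else bwd


# Illustrative vocabulary and sentence, not taken from the source.
vocab = {"研究", "研究生", "生命", "起源"}
print(bidirectional_max_match("研究生命起源", vocab))
```

Here the forward pass greedily takes 研究生 and is left with a stray single character, while the reverse pass recovers 研究 / 生命 / 起源, which is why the two directions are compared rather than trusting either alone.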
In one embodiment, the question triple construction module 240 is further configured to: convert the question key information into corresponding character vectors and extract feature information of the character vectors; obtain a globally optimal sequence according to the feature information, and perform named entity recognition on each character vector according to the globally optimal sequence; and perform dependency syntax analysis on the character vectors after named entity recognition to obtain a question triple.
The question triple construction module 240 may use a BERT model to convert the characters in the question key information into corresponding character vector representations and extract their feature information. Further, the question triple construction module 240 may use a BiLSTM model to extract contextual feature information from the text of the question key information, and use a CRF model to obtain the globally optimal tag sequence, thereby implementing named entity recognition. The question triple construction module 240 may then perform dependency syntax analysis through the LTP parser to obtain a question triple. The dependency syntax relationship categories mainly include: a predicate relationship, a predicate with prefix, a predicate with core, a parallel relationship, a predicate relationship with intermediary, an independent relationship, a predicate, a shape, and the like.
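Obtaining a "globally optimal sequence" from a linear-chain CRF is conventionally done with Viterbi decoding. The sketch below shows only that decoding step, under stated assumptions: the emission scores stand in for BiLSTM outputs, the transition scores stand in for learned CRF parameters, and all numbers and the B/I/O tag set are invented. The source does not say which decoder it uses.

```python
from typing import Dict, List


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the tag sequence with the highest total score.

    emissions[i][tag] is the score of tag at position i;
    transitions[prev][cur] is the score of moving from prev to cur.
    """
    tags = list(emissions[0])
    score = dict(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in tags:
            # Best previous tag for reaching cur at this position.
            best_prev = max(tags, key=lambda prev: score[prev] + transitions[prev][cur])
            new_score[cur] = score[best_prev] + transitions[best_prev][cur] + emit[cur]
            pointers[cur] = best_prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: score[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Invented scores: O -> I is heavily penalised, as an entity cannot start with I.
tags = ["B", "I", "O"]
transitions = {prev: {cur: 0.0 for cur in tags} for prev in tags}
transitions["O"]["I"] = -10.0
emissions = [
    {"B": 0.4, "I": 0.0, "O": 0.5},
    {"B": 0.0, "I": 2.0, "O": 0.0},
]
print(viterbi(emissions, transitions))
```

Picking the best tag at each position independently would give O then I here, an invalid sequence; decoding over the whole sequence lets the transition penalty override the slightly higher local score.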
As shown in fig. 3, in an embodiment, a method for creating a question-answering system based on social assistance is provided, which includes the following specific steps:
step 302, acquiring social rescue regulation data and social rescue problem data, determining an information extraction rule according to the social rescue regulation data, and determining a word segmentation rule according to the social rescue problem data;
step 304, performing data processing on the social rescue regulation data to obtain social rescue text contents, extracting reply key information from the social rescue text contents according to the information extraction rules, and extracting question key information from the social rescue question data according to the word segmentation rules;
step 306, acquiring predefined triple types and contents, and constructing reply triples from the reply key information by using the predefined triple types and contents, so as to generate a knowledge graph;
step 308, performing dependency syntax analysis on the question key information to obtain a question triple, and converting the question triple into a query statement;
step 310, searching the knowledge graph for a target reply triple corresponding to the query statement, and optimizing the target reply triple to obtain natural semantics corresponding to the query statement.
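Steps 308 and 310 can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: the knowledge graph is a plain list of (head, relation, tail) triples with invented contents, the "query statement" is a simple match pattern rather than a real graph query language, and the helper names `to_query`, `search`, and `render` are hypothetical.

```python
# Hypothetical reply triples; a real system would hold these in a graph store.
KNOWLEDGE_GRAPH = [
    ("subsistence allowance", "application authority", "township government"),
    ("subsistence allowance", "required material", "household register"),
    ("temporary assistance", "application authority", "township government"),
]


def to_query(question_triple):
    """Convert a question triple into a match pattern; None marks the unknown slot."""
    head, relation, tail = question_triple
    return {"head": head, "relation": relation, "tail": tail}


def search(knowledge_graph, query):
    """Return every reply triple that agrees with all known slots of the query."""
    matches = []
    for head, relation, tail in knowledge_graph:
        candidate = {"head": head, "relation": relation, "tail": tail}
        if all(value is None or candidate[slot] == value for slot, value in query.items()):
            matches.append((head, relation, tail))
    return matches


def render(triple):
    """Turn a target reply triple into a readable sentence (assumed template)."""
    head, relation, tail = triple
    return f"The {relation} of {head} is {tail}."


question_triple = ("subsistence allowance", "application authority", None)
answers = [render(t) for t in search(KNOWLEDGE_GRAPH, to_query(question_triple))]
```

The template in `render` is only a stand-in for the "optimizing" step, whose details the text does not give.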
In one embodiment, the method for creating a social aid-based question-answering system may further include a process of obtaining social aid text content, where the process includes: automatically crawling text data of the social rescue regulation data by using the Scrapy web crawler framework; and processing the text data into a picture or a PDF format file by using an OCR text recognition technology, and extracting the social aid text content in the picture or the PDF format file by using the Faster R-CNN algorithm.
In one embodiment, the method for creating a social aid-based question-answering system may further include a process of extracting key information, where the specific process includes: determining the maximum length of words in the social aid problem data; performing word segmentation on the sentence according to the maximum length by utilizing a forward maximum matching algorithm to obtain a first word segmentation result; segmenting words of the sentence according to the maximum length by utilizing a reverse maximum matching algorithm to obtain a second word segmentation result; and extracting the key problem information in the social rescue problem data according to the first word segmentation result and the second word segmentation result, and classifying the social rescue problem data.
In one embodiment, the method for creating a social aid-based question-answering system may further include a process of obtaining a word segmentation result, where the process includes: performing word segmentation matching on the sentence from left to right by utilizing a forward maximum matching algorithm according to the maximum length, and obtaining a first word segmentation result if the matching is successful; if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a first word segmentation result; performing word segmentation matching on the sentences from right to left by utilizing a reverse maximum matching algorithm according to the maximum length, and obtaining a second word segmentation result if the matching is successful; and if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a second word segmentation result.
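The forward and reverse maximum matching procedures described above can be sketched as follows. This is an illustrative sketch, not the patent's code: the dictionary contents and the example sentence are assumptions, and a single character is always accepted as a word so that matching is guaranteed to succeed once the candidate length has been reduced to one.

```python
def forward_max_match(sentence, dictionary, max_len):
    """Segment left to right, trying the longest candidate first."""
    result, start = [], 0
    while start < len(sentence):
        for size in range(min(max_len, len(sentence) - start), 0, -1):
            word = sentence[start:start + size]
            # Shrink the candidate until it is in the dictionary (or one character long).
            if size == 1 or word in dictionary:
                result.append(word)
                start += size
                break
    return result


def reverse_max_match(sentence, dictionary, max_len):
    """Segment right to left, trying the longest candidate first."""
    result, end = [], len(sentence)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            word = sentence[end - size:end]
            if size == 1 or word in dictionary:
                result.append(word)
                end -= size
                break
    result.reverse()  # words were collected from the end of the sentence
    return result


# Hypothetical dictionary; the maximum word length is derived from it.
DICTIONARY = {"研究", "研究生", "生命", "起源"}
MAX_LEN = max(len(word) for word in DICTIONARY)

first_result = forward_max_match("研究生命的起源", DICTIONARY, MAX_LEN)
second_result = reverse_max_match("研究生命的起源", DICTIONARY, MAX_LEN)
```

On this sentence the two directions disagree: the forward pass greedily takes the three-character word at the start, while the reverse pass produces the segmentation a reader would expect, which is why the two results are compared afterwards.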
In one embodiment, the method for creating a social aid-based question-answering system may further include a process of determining the question key information, where the specific process includes: comparing the first word segmentation result with the second word segmentation result, and, when the two results are the same, returning either of them as the question key information.
In one embodiment, the method for creating a social aid-based question-answering system may further include a process of constructing a question triplet, where the process includes: converting the key problem information into corresponding character vectors, and extracting characteristic information of the character vectors; obtaining a global optimal sequence according to the feature vector information, and carrying out entity identification naming on each character vector according to the global optimal sequence; and carrying out dependency syntax analysis on the character vectors after the entity identification and naming to obtain a problem triple.
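As a rough picture of how a dependency parse can yield a question triple, the sketch below reads a (subject, predicate, object) triple off a set of dependency arcs. The arcs are written by hand for the example rather than produced by a parser, the relation labels SBV, VOB, ATT, and HED follow the naming commonly used by LTP, and the function name `extract_triple` is hypothetical.

```python
def extract_triple(words, arcs):
    """Build a (subject, predicate, object) triple from dependency arcs.

    arcs[i] = (head_index, relation) for words[i]; the root word carries
    the relation "HED" and a head index of -1.
    """
    root = next(i for i, (_, relation) in enumerate(arcs) if relation == "HED")
    subject = obj = None
    for i, (head, relation) in enumerate(arcs):
        if head != root:
            continue  # only direct dependents of the core predicate are used
        if relation == "SBV":    # subject-verb
            subject = words[i]
        elif relation == "VOB":  # verb-object
            obj = words[i]
    return (subject, words[root], obj)


# Hand-written parse of "低保 需要 什么 材料" (an assumed example question).
words = ["低保", "需要", "什么", "材料"]
arcs = [(1, "SBV"), (-1, "HED"), (3, "ATT"), (1, "VOB")]
triple = extract_triple(words, arcs)
```

A slot is left as `None` when the parse has no matching dependent, which a later step could treat as the unknown part of the question.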
It should be understood that, although the steps in the above flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a method for creating a social aid-based question-answering system. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring social rescue regulation data and social rescue problem data, determining an information extraction rule according to the social rescue regulation data, and determining a word segmentation rule according to the social rescue problem data;
performing data processing on the social rescue specified data to obtain social rescue text contents, extracting reply key information from the social rescue text contents according to information extraction rules, and extracting question key information from the social rescue question data according to word segmentation rules;
acquiring a predefined three-tuple type and content, and constructing a reply triple according to reply key information by using the predefined three-tuple type and content to generate a knowledge graph;
performing dependency syntax analysis on the problem key information to obtain a problem triple, and converting the problem triple into a query statement;
and searching a target reply triple corresponding to the query sentence in the knowledge graph, and optimizing the target reply triple to obtain natural semantics corresponding to the query sentence.
In one embodiment, the processor, when executing the computer program, further performs the steps of: automatically crawling text data of the social rescue regulation data by using the Scrapy web crawler framework; and processing the text data into a picture or a PDF format file by using an OCR text recognition technology, and extracting the social aid text content in the picture or the PDF format file by using the Faster R-CNN algorithm.
In one embodiment, the processor, when executing the computer program, further performs the steps of: determining the maximum length of words in the social aid problem data; performing word segmentation on the sentence according to the maximum length by utilizing a forward maximum matching algorithm to obtain a first word segmentation result; segmenting words of the sentence according to the maximum length by utilizing a reverse maximum matching algorithm to obtain a second word segmentation result; and extracting the key problem information in the social rescue problem data according to the first word segmentation result and the second word segmentation result, and classifying the social rescue problem data.
In one embodiment, the processor, when executing the computer program, further performs the steps of: performing word segmentation matching on the sentence from left to right by utilizing a forward maximum matching algorithm according to the maximum length, and obtaining a first word segmentation result if the matching is successful; if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a first word segmentation result; performing word segmentation matching on the sentences from right to left by utilizing a reverse maximum matching algorithm according to the maximum length, and obtaining a second word segmentation result if the matching is successful; and if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a second word segmentation result.
In one embodiment, the processor, when executing the computer program, further performs the steps of: comparing the first word segmentation result with the second word segmentation result, and, when the two results are the same, returning either of them as the question key information.
In one embodiment, the processor, when executing the computer program, further performs the steps of: converting the key problem information into corresponding character vectors, and extracting characteristic information of the character vectors; obtaining a global optimal sequence according to the feature vector information, and carrying out entity identification naming on each character vector according to the global optimal sequence; and carrying out dependency syntax analysis on the character vectors after the entity identification and naming to obtain a problem triple.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring social rescue regulation data and social rescue problem data, determining an information extraction rule according to the social rescue regulation data, and determining a word segmentation rule according to the social rescue problem data;
performing data processing on the social rescue specified data to obtain social rescue text contents, extracting reply key information from the social rescue text contents according to information extraction rules, and extracting question key information from the social rescue question data according to word segmentation rules;
acquiring a predefined three-tuple type and content, and constructing a reply triple according to reply key information by using the predefined three-tuple type and content to generate a knowledge graph;
performing dependency syntax analysis on the problem key information to obtain a problem triple, and converting the problem triple into a query statement;
and searching a target reply triple corresponding to the query sentence in the knowledge graph, and optimizing the target reply triple to obtain natural semantics corresponding to the query sentence.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: automatically crawling text data of the social rescue regulation data by using the Scrapy web crawler framework; and processing the text data into a picture or a PDF format file by using an OCR text recognition technology, and extracting the social aid text content in the picture or the PDF format file by using the Faster R-CNN algorithm.
In one embodiment, the computer program when executed by the processor further performs the steps of: determining the maximum length of words in the social aid problem data; performing word segmentation on the sentence according to the maximum length by utilizing a forward maximum matching algorithm to obtain a first word segmentation result; segmenting words of the sentence according to the maximum length by utilizing a reverse maximum matching algorithm to obtain a second word segmentation result; and extracting the key problem information in the social rescue problem data according to the first word segmentation result and the second word segmentation result, and classifying the social rescue problem data.
In one embodiment, the computer program when executed by the processor further performs the steps of: performing word segmentation matching on the sentence from left to right by utilizing a forward maximum matching algorithm according to the maximum length, and obtaining a first word segmentation result if the matching is successful; if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a first word segmentation result; performing word segmentation matching on the sentences from right to left by utilizing a reverse maximum matching algorithm according to the maximum length, and obtaining a second word segmentation result if the matching is successful; and if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain a second word segmentation result.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: comparing the first word segmentation result with the second word segmentation result, and, when the two results are the same, returning either of them as the question key information.
In one embodiment, the computer program when executed by the processor further performs the steps of: converting the key problem information into corresponding character vectors, and extracting characteristic information of the character vectors; obtaining a global optimal sequence according to the feature vector information, and carrying out entity identification naming on each character vector according to the global optimal sequence; and carrying out dependency syntax analysis on the character vectors after the entity identification and naming to obtain a problem triple.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination of them should be considered to fall within the scope of this specification as long as the combined features do not contradict one another.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A social aid based question-answering system, the system comprising:
the rule determining module is used for acquiring social rescue regulation data and social rescue problem data, determining an information extraction rule according to the social rescue regulation data, and determining a word segmentation rule according to the social rescue problem data;
the key information extraction module is used for carrying out data processing on the social rescue specified data to obtain social rescue text contents, extracting reply key information from the social rescue text contents according to the information extraction rule, and extracting question key information from the social rescue question data according to the word segmentation rule;
the knowledge graph generating module is used for acquiring the predefined three-tuple types and contents, and constructing a reply triple according to the reply key information by using the predefined three-tuple types and contents to generate a knowledge graph;
the problem triple construction module is used for carrying out dependency syntax analysis on the problem key information to obtain a problem triple and converting the problem triple into a query statement;
and the natural semantic obtaining module is used for searching the target reply triple corresponding to the query sentence in the knowledge graph and optimizing the target reply triple to obtain the natural semantic corresponding to the query sentence.
2. The social aid-based question-answering system according to claim 1, wherein the key information extraction module is further configured to: automatically crawl text data of the social rescue regulation data by using the Scrapy web crawler framework; and process the text data into a picture or a PDF format file by using an OCR text recognition technology, and extract the social aid text content in the picture or the PDF format file by using the Faster R-CNN algorithm.
3. The social aid-based question-answering system according to claim 1, wherein the key information extraction module is further configured to: determining the maximum length of words in the social aid problem data; performing word segmentation on the sentence according to the maximum length by utilizing a forward maximum matching algorithm to obtain a first word segmentation result; segmenting words of the sentence according to the maximum length by utilizing a reverse maximum matching algorithm to obtain a second word segmentation result; and extracting the key problem information in the social rescue problem data according to the first word segmentation result and the second word segmentation result, and classifying the social rescue problem data.
4. The social aid-based question-answering system according to claim 3, wherein the key information extraction module is further configured to: performing word segmentation matching on the sentence from left to right according to the maximum length by using a forward maximum matching algorithm, and obtaining the first word segmentation result if the matching is successful; if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain the first segmentation result; performing word segmentation matching on the sentences from right to left by utilizing a reverse maximum matching algorithm according to the maximum length, and if the matching is successful, obtaining a second word segmentation result; and if the matching is unsuccessful, reducing the length of the matched word until the matching is successful to obtain the second word segmentation result.
5. The social aid-based question-answering system according to claim 3, wherein the key information extraction module is further configured to: compare the first word segmentation result with the second word segmentation result, and, when the two results are the same, return either the first word segmentation result or the second word segmentation result as the question key information.
6. The social aid based question-answering system according to claim 1, wherein the question triplet construction module is further configured to: converting the problem key information into corresponding character vectors, and extracting characteristic information of the character vectors; obtaining a global optimal sequence according to the feature vector information, and carrying out entity identification naming on each character vector according to the global optimal sequence; and carrying out dependency syntax analysis on the character vectors after the entity identification and naming to obtain a problem triple.
7. A method for creating a question-answering system based on social aid is characterized by comprising the following steps:
acquiring social rescue regulation data and social rescue problem data, determining an information extraction rule according to the social rescue regulation data, and determining a word segmentation rule according to the social rescue problem data;
performing data processing on the social rescue specified data to obtain social rescue text contents, extracting reply key information from the social rescue text contents according to the information extraction rules, and extracting question key information from the social rescue question data according to the word segmentation rules;
acquiring a predefined three-tuple type and content, and constructing a reply triple according to the reply key information by using the predefined three-tuple type and content to generate a knowledge graph;
performing dependency syntax analysis on the problem key information to obtain a problem triple, and converting the problem triple into a query statement;
and searching a target answer triple corresponding to the query statement in the knowledge graph, and optimizing the target answer triple to obtain natural semantics corresponding to the query statement.
8. The method for creating a question-answering system based on social aid according to claim 7, wherein the step of performing data processing on the social aid provision data to obtain social aid text contents comprises:
automatically crawling text data of the social rescue regulation data by using the Scrapy web crawler framework;
and processing the text data into a picture or a PDF format file by using an OCR text recognition technology, and extracting social aid text content in the picture or the PDF format file by using a Faster R-CNN algorithm.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the creation method corresponding to the social aid-based question-answering system according to any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the creation method corresponding to the social aid-based question-answering system according to any one of claims 1 to 6.
CN202111465276.3A 2021-12-03 2021-12-03 Question-answering system based on social aid, construction method, computer equipment and medium Pending CN114238715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111465276.3A CN114238715A (en) 2021-12-03 2021-12-03 Question-answering system based on social aid, construction method, computer equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111465276.3A CN114238715A (en) 2021-12-03 2021-12-03 Question-answering system based on social aid, construction method, computer equipment and medium

Publications (1)

Publication Number Publication Date
CN114238715A true CN114238715A (en) 2022-03-25

Family

ID=80752899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111465276.3A Pending CN114238715A (en) 2021-12-03 2021-12-03 Question-answering system based on social aid, construction method, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN114238715A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117725187A (en) * 2024-02-08 2024-03-19 人和数智科技有限公司 Question-answering system suitable for social assistance
CN117725187B (en) * 2024-02-08 2024-04-30 人和数智科技有限公司 Question-answering system suitable for social assistance

Similar Documents

Publication Publication Date Title
CN109446302B (en) Question-answer data processing method and device based on machine learning and computer equipment
CN110457431B (en) Knowledge graph-based question and answer method and device, computer equipment and storage medium
US20220188521A1 (en) Artificial intelligence-based named entity recognition method and apparatus, and electronic device
CN108595695B (en) Data processing method, data processing device, computer equipment and storage medium
CN110765265B (en) Information classification extraction method and device, computer equipment and storage medium
CN109815333B (en) Information acquisition method and device, computer equipment and storage medium
CN108427707B (en) Man-machine question and answer method, device, computer equipment and storage medium
WO2020077896A1 (en) Method and apparatus for generating question data, computer device, and storage medium
CN108664595B (en) Domain knowledge base construction method and device, computer equipment and storage medium
CN112015900B (en) Medical attribute knowledge graph construction method, device, equipment and medium
CN113157863A (en) Question and answer data processing method and device, computer equipment and storage medium
CN110442697B (en) Man-machine interaction method, system, computer equipment and storage medium
CN111177405A (en) Data search matching method and device, computer equipment and storage medium
CN112035611B (en) Target user recommendation method, device, computer equipment and storage medium
WO2021082086A1 (en) Machine reading method, system, device, and storage medium
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN114139551A (en) Method and device for training intention recognition model and method and device for recognizing intention
CN112766319A (en) Dialogue intention recognition model training method and device, computer equipment and medium
CN113704420A (en) Method and device for identifying role in text, electronic equipment and storage medium
CN115455169A (en) Knowledge graph question-answering method and system based on vocabulary knowledge and semantic dependence
CN113342944B (en) Corpus generalization method, apparatus, device and storage medium
CN114238715A (en) Question-answering system based on social aid, construction method, computer equipment and medium
CN112989829B (en) Named entity recognition method, device, equipment and storage medium
CN117520590A (en) Ocean cross-modal image-text retrieval method, system, equipment and storage medium
CN112037904A (en) Online diagnosis and treatment data processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination