CN118093834B - AIGC large model-based language processing question-answering system and method - Google Patents
- Publication number: CN118093834B
- Application number: CN202410479542.5A
- Authority: CN (China)
- Prior art keywords: domain, aigc, answer, question, model
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/3329: Natural language query formulation or dialogue systems
- G06F16/3344: Query execution using natural language analysis
- G06F40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/253: Grammatical analysis; Style critique
- G06F40/30: Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to the technical field of language processing, and in particular to a language processing question-answering system and method based on an AIGC large model. The method comprises the following steps: receiving a natural language question input by a user, and extracting key information through syntactic analysis and semantic understanding; inputting the extracted key information into an AIGC-based language model and, after domain adaptability enhancement processing, using the AIGC large model to generate a series of answer candidates according to the input information and the enhanced domain knowledge; evaluating the answer candidates to select an optimal answer; and outputting the optimal answer to the user in natural language. The invention significantly enhances the adaptability and processing capability of the AIGC large model for domain-specific questions. This enhancement not only broadens the range of professional fields in which the question-answering system can be applied, but also improves its flexibility and accuracy when facing new domains or obscure questions.
Description
Technical Field
The invention relates to the technical field of language processing, and in particular to a language processing question-answering system and method based on an AIGC large model.
Background
In the current state of the art, significant advances have been made in artificial intelligence and natural language processing (NLP), particularly in language understanding and generation. AIGC (artificial intelligence generated content) technology, and large pre-trained language models in particular, has demonstrated strong capability across many language processing tasks: such models can understand complex language constructs and contextual meaning, and can perform tasks such as text classification, sentiment analysis, text summarization and question answering.
Nonetheless, existing language processing question-answering systems still face some key challenges. One of these is how to effectively understand and answer obscure questions that relate to a particular domain (e.g., medicine, law or technology). Such questions often contain technical terms and complex concepts, and require the system to have in-depth domain knowledge and understanding. In addition, the prior art is limited in the diversity and naturalness of the answers it generates, and in user interaction.
Furthermore, despite the broad knowledge coverage of large models, their adaptability and flexibility in particular domains remain limited. For example, a general-purpose language model trained on broad data may have difficulty accurately handling obscure terms and questions that occur only in certain specialized fields. Improving model performance in a specific domain, and improving the accuracy, relevance and naturalness of answers, has therefore become an important focus of research and development.
In summary, while existing AIGC techniques and language models have achieved significant results across a wide range of language tasks, domain-specific question answering, answer quality and the user interaction experience still need improvement. Developing a language processing question-answering method that effectively integrates domain knowledge, improves the quality of generated answers and optimizes user interaction is therefore of great significance for the further development of language processing technology.
Disclosure of Invention
To this end, the invention provides a language processing question-answering system and method based on an AIGC large model.
An AIGC large model-based language processing question-answering method comprises the following steps:
S1: receiving a natural language question input by a user, and extracting key information through syntactic analysis and semantic understanding;
S2: inputting the extracted key information into an AIGC-based language model and, after domain adaptability enhancement processing, using the AIGC large model to generate a series of answer candidates according to the input information and the enhanced domain knowledge;
S3: evaluating the answer candidates to select an optimal answer;
S4: outputting the optimal answer to the user in natural language.
Further, S1 specifically includes:
S11, receiving: receiving a natural language question input by the user through a user interface, wherein the user interface supports both text input and voice input;
S12, preprocessing: preprocessing the question entered by the user, including removing irrelevant characters, correcting spelling errors, and converting voice input to text (if the question was originally entered by voice);
S13, syntactic analysis: analyzing the question using natural language processing technology and identifying its sentence structure, including sentence components such as subject, predicate and object;
S14, semantic understanding: performing semantic analysis on the question through a deep learning model and a natural language understanding algorithm, so as to understand the intent and contextual meaning of the question;
S15, key information extraction: based on the results of the syntactic analysis and semantic understanding, extracting the key information in the question, wherein the key information comprises:
Keywords: the main nouns, verbs and adjectives in the question, i.e., words that refer to a particular concept, object or action;
Entities: specific entities mentioned in the question, including names, places, organizations and dates;
Relationships and attributes: relationships between entities implied in the question, and related attributes and features;
Question type: the type of the question, determined from its structure and wording, including a factual query, an explanation request, or an operation guide.
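As a rough illustration of the output of S15, the sketch below collects keywords, entities and a question type into one record. It is a toy, rule-based stand-in: the source relies on a deep learning model and natural language understanding algorithms, whereas the stop-word list, the cue phrases, the `KeyInfo` record and the capitalization heuristic here are all hypothetical, and relationship extraction is omitted.

```python
import re
from dataclasses import dataclass

# Hypothetical stop-word list; a real system would use a parser and an NER model.
STOP_WORDS = {"what", "is", "the", "a", "an", "of", "how", "do", "i",
              "why", "does", "to", "in", "can"}


@dataclass
class KeyInfo:
    keywords: list
    entities: list
    question_type: str


def classify_question(text: str) -> str:
    """Guess the question type from its opening words (illustrative cues only)."""
    lowered = text.lower().strip()
    if lowered.startswith(("how do", "how can", "how to")):
        return "operation_guide"
    if lowered.startswith(("why", "explain")):
        return "explanation_request"
    return "factual_query"


def extract_key_info(text: str) -> KeyInfo:
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    # Keywords: every token that is not a stop word, lower-cased.
    keywords = [t.lower() for t in tokens if t.lower() not in STOP_WORDS]
    # Crude entity heuristic: capitalized, non-initial tokens that are not stop words.
    entities = [t for t in tokens[1:]
                if t[0].isupper() and t.lower() not in STOP_WORDS]
    return KeyInfo(keywords, entities, classify_question(text))


info = extract_key_info("How do I treat Kawasaki disease?")
print(info)
```

The record is what the later domain identification and knowledge graph steps would consume; only its shape matters here, not the heuristics used to fill it.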
Further, the domain adaptability enhancement processing in S2 specifically includes:
S21: determining the specific domain to which the question belongs using a domain identification algorithm, and extracting the question and term library related to that domain;
S22: matching obscure terms and concepts in the question with nodes in a domain-specific knowledge graph constructed in cooperation with domain experts, so as to understand their deeper meaning and contextual relationships;
S23: adjusting the AIGC large model in real time by combining the context of the question with the domain knowledge graph, so as to enhance the model's ability to handle obscure questions and technical terms;
S24: inputting the adjusted question representation and the domain knowledge as enhancement information into the AIGC-based language model, in preparation for generating more accurate and more specialized answers.
Further, S21 specifically includes:
Feature extraction: extracting linguistic features from the user's question, including word frequency, part-of-speech tags, semantic role labels and contextual embedding vectors, which together reflect the linguistic characteristics and deeper semantics of the question;
Domain feature vectorization: converting the extracted features into a domain feature vector $\mathbf{F}$, wherein each dimension is a numerical representation of a linguistic feature associated with the domain;
Domain similarity calculation: using the domain identification algorithm to compute the similarity between the question feature vector $\mathbf{F}$ and each vector in a predefined set of domain vectors $\{\mathbf{D}_1, \mathbf{D}_2, \ldots, \mathbf{D}_n\}$ (each domain vector $\mathbf{D}_i$ is the feature vector of a particular domain): $\mathrm{Sim}(\mathbf{F}, \mathbf{D}_i) = \dfrac{\mathbf{F} \cdot \mathbf{D}_i}{\lVert \mathbf{F} \rVert \, \lVert \mathbf{D}_i \rVert}$;
wherein $\mathbf{F} \cdot \mathbf{D}_i$ is the dot product of the two vectors, and $\lVert \mathbf{F} \rVert$ and $\lVert \mathbf{D}_i \rVert$ are the Euclidean norms of $\mathbf{F}$ and $\mathbf{D}_i$, respectively;
Domain determination: selecting the domain whose domain vector has the highest similarity as the specific domain to which the question belongs;
Term library extraction: according to the determined domain, extracting from a database the specialized question and term library related to that domain, which comprises the domain's key terms, definitions, and common questions and answers.
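The domain similarity calculation and domain determination steps amount to a cosine-similarity nearest-neighbour lookup. A minimal sketch follows, assuming the feature vectors are already available as plain lists of floats; the three-dimensional vectors and the domain names in `DOMAIN_VECTORS` are invented for illustration and do not come from the source.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def identify_domain(question_vec, domain_vecs):
    """Return the best-matching domain name and the full score table."""
    scores = {name: cosine_similarity(question_vec, vec)
              for name, vec in domain_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical domain vectors; real ones would be derived from domain corpora.
DOMAIN_VECTORS = {
    "medical": [0.9, 0.1, 0.0],
    "legal": [0.1, 0.9, 0.1],
    "technology": [0.0, 0.2, 0.9],
}

domain, scores = identify_domain([0.8, 0.2, 0.1], DOMAIN_VECTORS)
print(domain, scores)
```

The returned domain name would then serve as the key for looking up the corresponding term library in the database.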
Further, S22 specifically includes:
Knowledge graph construction: in cooperation with domain experts, constructing a knowledge graph containing the important concepts, terms and entities of the domain and their interrelationships, wherein each node represents a concept or entity in the domain, and the edges between nodes represent the relationships between those concepts or entities;
Obscure term identification: analyzing the user's question through natural language processing technology and identifying the obscure terms and concepts in it, wherein an obscure term is a word that occurs infrequently in the corpus but has a specific meaning in a particular domain;
Term-to-graph mapping: mapping the identified obscure terms and concepts to nodes in the knowledge graph, wherein the mapping uses a matching algorithm based on semantic similarity that considers the semantic features of the terms and the attributes of the graph nodes to determine the best-matching node;
Contextual relationship resolution: using the edges of the knowledge graph to analyze the contextual relationships between the obscure terms and the concepts in the question, and revealing the role and meaning of each obscure term in the specific question by analyzing the other nodes connected to its matching node and the types of those relationships;
Deep meaning understanding: analyzing the deeper meaning of the obscure terms and concepts by combining the structural information of the graph with the contextual relationships of the terms.
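The mapping and relationship-resolution steps can be sketched with a dictionary-based graph. This is only an illustration under stated assumptions: the tiny `GRAPH`, its relation labels and the `threshold` value are hypothetical, and string similarity from the standard library's `difflib.SequenceMatcher` stands in for the semantic-similarity matching described above, which would normally compare embeddings.

```python
from difflib import SequenceMatcher

# Hypothetical toy graph: node -> list of (relation, neighbour node).
GRAPH = {
    "kawasaki disease": [("is a", "vasculitis"),
                         ("is related to", "coronary artery aneurysm")],
    "vasculitis": [("belongs to", "inflammatory disease")],
    "coronary artery aneurysm": [],
    "inflammatory disease": [],
}


def best_matching_node(term, graph, threshold=0.6):
    """Pick the node most similar to the term, or None if nothing is close enough."""
    scored = [(SequenceMatcher(None, term.lower(), node).ratio(), node)
              for node in graph]
    score, node = max(scored)
    return node if score >= threshold else None


def resolve_context(term, graph):
    """Return the matched node and its outgoing (relation, neighbour) edges."""
    node = best_matching_node(term, graph)
    if node is None:
        return None, []
    return node, graph[node]


node, edges = resolve_context("Kawasaki's disease", GRAPH)
print(node, edges)
```

The edges returned for the matched node are the raw material for the contextual relationship resolution step: they state how the term relates to neighbouring concepts in the domain.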
Further, S23 specifically includes:
S231, context and domain knowledge integration: integrating the context information of the question with the deeper meaning and relationships of the obscure terms and their related concepts, as obtained from the domain knowledge graph, into an enhanced feature representation, wherein the enhanced feature representation contains both the original semantic information of the question and in-depth knowledge of the specific domain;
S232, feature conversion: converting the integrated feature representation into a form suitable for the AIGC large model using an autoencoder. The encoder is expressed as $h = f(W_e x + b_e)$, where $x$ is the input feature, $W_e$ is the encoder weight, $b_e$ is the encoder bias term, $f$ is the activation function, and $h$ is the generated hidden-layer representation (i.e., the encoding). The decoder is expressed as $\hat{x} = g(W_d h + b_d)$, where $W_d$ is the decoder weight, $b_d$ is the decoder bias term, $g$ is the activation function, and $\hat{x}$ is the reconstructed input. The goal of the autoencoder is to minimize the difference between the input $x$ and the reconstructed input $\hat{x}$, using the loss function $L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$. The autoencoder is trained to minimize this loss function and thereby learns a compressed representation of the input data, which is used for the feature conversion;
S233, model adjustment: based on the converted feature representation, adjusting the parameters of the AIGC large model in real time, wherein the adjustment uses transfer learning so that the AIGC large model adapts to the background and semantic requirements of the specific domain of the current question, and the transfer learning process is as follows:
Pre-training a model on a source task to learn a representation of the source-domain data;
Transferring a portion of the pre-trained model (e.g., the feature extraction layers) to the target task;
Fine-tuning the transferred portion on target-domain data while keeping the other portions fixed or fine-tuning them as well;
S234, enhanced processing capability verification: checking, through a preset verification mechanism, whether the model's ability to handle obscure questions and technical terms has been significantly enhanced after the adjustment, so as to ensure that the adjustment has the expected effect.
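The autoencoder used for feature conversion in S232 can be illustrated with a very small pure-Python implementation. The choices below are assumptions made for the sketch, not details given in the source: a tanh encoder, a linear decoder, a squared-error reconstruction loss, per-sample gradient descent, and a four-dimensional toy dataset.

```python
import math
import random


def train_autoencoder(data, hidden_dim, epochs=300, lr=0.05, seed=0):
    """Fit h = tanh(W_e x + b_e), x_hat = W_d h + b_d by per-sample gradient descent."""
    rng = random.Random(seed)
    n = len(data[0])
    W_e = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden_dim)]
    b_e = [0.0] * hidden_dim
    W_d = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(n)]
    b_d = [0.0] * n

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(W_e, b_e)]

    def decode(h):
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(W_d, b_d)]

    def mean_loss():
        total = 0.0
        for x in data:
            x_hat = decode(encode(x))
            total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
        return total / len(data)

    history = [mean_loss()]
    for _ in range(epochs):
        for x in data:
            h = encode(x)
            x_hat = decode(h)
            err = [2 * (xh - xi) for xh, xi in zip(x_hat, x)]  # dL/dx_hat
            # Backpropagate to the hidden layer before the decoder weights change.
            grad_h = [sum(err[i] * W_d[i][j] for i in range(n)) * (1 - h[j] ** 2)
                      for j in range(hidden_dim)]
            for i in range(n):
                for j in range(hidden_dim):
                    W_d[i][j] -= lr * err[i] * h[j]
                b_d[i] -= lr * err[i]
            for j in range(hidden_dim):
                for i in range(n):
                    W_e[j][i] -= lr * grad_h[j] * x[i]
                b_e[j] -= lr * grad_h[j]
        history.append(mean_loss())
    return encode, decode, history


# Hypothetical 4-dimensional "enhanced feature" vectors.
DATA = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
encode, decode, history = train_autoencoder(DATA, hidden_dim=2)
print(history[0], history[-1], encode(DATA[0]))
```

After training, `encode(x)` is the compressed representation that would be passed on to the model adjustment step; the decoder exists only to supply the training signal.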
Further, generating answer candidates with the AIGC large model in S2 specifically includes:
Enhancement information integration: integrating the adjusted representation of the user's question and the domain knowledge into an enhancement information set comprising the adjusted question features and the domain-specific terms, concepts and their interrelationships;
Context-aware encoding: processing the enhancement information set with an encoder to capture the complex relationships between the deeper semantic features of the question and the domain knowledge, the encoder outputting a high-dimensional feature representation that combines the question context and the domain knowledge;
Answer generation: inputting the encoded high-dimensional feature representation into the decoder of the AIGC large model, which uses this representation to generate a series of answer candidates through a sequence generation mechanism while taking the question context and domain knowledge into account.
Further, in S3, beam search is used to evaluate the answer candidates so that the generated answers are both diverse and highly relevant, and the beam search specifically includes:
Initialization: setting the beam width $k$; at the start of decoding, a beam of size $k$ is initialized, each candidate being a partial solution sequence containing only a start tag (e.g., <start>);
Iterative expansion: in each iteration, for each partial solution sequence in the beam, predicting the next word (or token) and its probability; for each partial solution, selecting the $k$ words with the highest probability and combining each of them with that partial solution to form new partial solution sequences;
Score calculation: the score of each newly generated partial solution sequence is calculated by accumulating the log-probabilities of its constituent words, as follows:
$\mathrm{score}(Y) = \sum_{t=1}^{T} \log P(y_t \mid y_1, \ldots, y_{t-1}, C)$, where $Y$ is a partial solution sequence, $y_t$ is the $t$-th word in the sequence, $P(y_t \mid y_1, \ldots, y_{t-1}, C)$ is the conditional probability of the word $y_t$ given the preceding words $y_1, \ldots, y_{t-1}$ and the context $C$ (i.e., the question representation and domain knowledge), and $T$ is the number of words in the sequence;
Selection and retention: after each iteration, selecting the $k$ highest-scoring partial solutions from all newly generated partial solutions and adding them to the beam for the next round of iterative expansion;
Termination condition: the iterative process continues until a predefined maximum length is reached, or a partial solution sequence in the beam ends with an end tag (e.g., <end>);
The highest-scoring sequence in the final beam is selected as the answer candidate; where multiple answer candidates are required, the top-ranked sequences are selected.
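The beam search steps above can be sketched in a few lines. The sketch makes some simplifying assumptions that differ slightly from the text: the beam starts from a single `<start>` sequence rather than $k$ identical copies, finished sequences are carried forward unchanged, and the loop stops once every sequence in the beam has ended. The function `next_token_probs` and the `TOY_MODEL` table are hypothetical stand-ins for the AIGC model's decoder.

```python
import math


def beam_search(next_token_probs, beam_width, max_len,
                start="<start>", end="<end>"):
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beam = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beam:
            if seq[-1] == end:
                candidates.append((seq, score))  # finished: carry over unchanged
                continue
            probs = next_token_probs(seq)
            # Keep only the beam_width most probable continuations of this sequence.
            top = sorted(probs.items(), key=lambda kv: kv[1],
                         reverse=True)[:beam_width]
            for token, p in top:
                if p > 0:
                    candidates.append((seq + [token], score + math.log(p)))
        # Retain the beam_width highest-scoring sequences overall.
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end for seq, _ in beam):
            break
    return beam


# Hypothetical next-token distributions, conditioned on the last token only.
TOY_MODEL = {
    "<start>": {"aspirin": 0.6, "rest": 0.4},
    "aspirin": {"helps": 0.9, "<end>": 0.1},
    "rest": {"<end>": 1.0},
    "helps": {"<end>": 1.0},
}


def toy_next_token_probs(seq):
    return TOY_MODEL[seq[-1]]


for seq, score in beam_search(toy_next_token_probs, beam_width=2, max_len=4):
    print(" ".join(seq), round(score, 3))
```

Summing log-probabilities rather than multiplying probabilities avoids numerical underflow on long sequences; practical systems often also normalize by length, which the source does not mention and this sketch omits.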
Further, S4 further includes selecting a formatting scheme according to the content and type of the answer, adding contextual information to the answer, and highlighting or emphasizing key information in the answer, including using bold, italics or color changes to draw the user's attention to the important parts.
An AIGC large model-based language processing question-answering system for implementing the above AIGC large model-based language processing question-answering method comprises the following modules:
User interface module: responsible for receiving natural language questions input by the user in text or voice form, and for presenting the final answer to the user in a natural, user-friendly manner;
Question understanding module: performing syntactic analysis and semantic understanding on the question input by the user using natural language processing technology, and extracting the key information of the question, wherein the key information comprises keywords, entities, relationships and the question type;
Domain adaptability enhancement processing module: comprising a domain identification sub-module, a domain knowledge graph matching sub-module and a domain adaptation algorithm sub-module, which are used respectively for determining the specific domain to which the question belongs, matching related concepts in the domain knowledge graph, and adjusting the AIGC large model in real time;
Answer generation module: using the AIGC large model that has undergone domain adaptability enhancement processing to generate a series of answer candidates according to the context of the question and the domain knowledge, and using a beam search algorithm to optimize the answer generation process;
Answer evaluation and selection module: comprehensively evaluating the answer candidates, including content overlap measurement, semantic similarity measurement, language fluency checking and grammatical correctness verification, so as to select an optimal answer.
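The comprehensive evaluation performed by the answer evaluation and selection module can be pictured as a weighted sum of independent scorers. The sketch below implements only a content overlap measure (token-level F1) and leaves semantic similarity, fluency and grammar as pluggable functions. Comparing each candidate against the question text, the weighting scheme and the example sentences are assumptions made for illustration; the source does not specify what the overlap is measured against.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def overlap_f1(candidate, reference):
    """Token-level F1 between two texts, a simple content overlap measure."""
    c, r = Counter(tokenize(candidate)), Counter(tokenize(reference))
    common = sum((c & r).values())  # multiset intersection size
    if common == 0:
        return 0.0
    precision = common / sum(c.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def select_best(candidates, question, scorers):
    """scorers: list of (weight, function(candidate, question) -> float in [0, 1])."""
    def total(candidate):
        return sum(weight * fn(candidate, question) for weight, fn in scorers)
    return max(candidates, key=total)


candidates = ["aspirin reduces fever", "the weather is nice"]
best = select_best(candidates, "does aspirin reduce fever", [(1.0, overlap_f1)])
print(best)
```

Additional scorers, for example an embedding-based similarity or a language-model fluency score, would be appended to the `scorers` list with their own weights.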
The beneficial effects of the invention are as follows:
By combining domain adaptability enhancement processing with a domain knowledge graph, the invention can accurately understand and answer queries involving obscure questions and technical terms in a specific domain. This not only strengthens the model's understanding of the deeper meaning of a question but also ensures that answers are accurate and highly relevant, thereby meeting the needs of users in specialized fields.
Through domain adaptability enhancement processing, the invention gains an in-depth understanding of obscure terms and complex concepts in a specific domain, which ensures that answers are professional and accurate. This understanding enables the system to handle and answer specialized questions that traditional language models find difficult to capture accurately, and the use of the domain knowledge graph together with the real-time adjustment mechanism significantly enhances the adaptability and processing capability of the AIGC large model for domain-specific questions. This enhancement not only broadens the range of professional fields in which the question-answering system can be applied, but also improves its flexibility and accuracy when facing new domains or obscure questions.
Through the beam search algorithm, the invention can select the best answer from a wide range of candidate answers. The selection mechanism scores candidates comprehensively on relevance and naturalness, ensuring that the answer finally presented to the user is both highly relevant to the question and expressed fluently and naturally. By keeping several of the best candidate solutions at each step, the beam search algorithm also preserves the diversity of the answers. This diversity is particularly important for open questions with multiple possible answers, since it allows more comprehensive information to be provided to meet the needs of different users.
Drawings
In order to illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the system modules according to an embodiment of the invention.
Detailed Description
The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.
It is to be noted that, unless otherwise defined, technical or scientific terms used herein have the ordinary meaning understood by a person of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity or importance, but are used only to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item preceding the word encompasses the elements or items listed after the word and their equivalents, without excluding other elements or items. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right" and the like are used only to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
As shown in FIG. 1, an AIGC large model-based language processing question-answering method comprises the following steps:
S1: receiving a natural language question input by a user, and extracting key information through syntactic analysis and semantic understanding;
S2: inputting the extracted key information into an AIGC-based language model and, after domain adaptability enhancement processing, using the AIGC large model to generate a series of answer candidates according to the input information and the enhanced domain knowledge;
S3: evaluating the answer candidates to select an optimal answer;
S4: outputting the optimal answer to the user in natural language.
S1 specifically includes:
S11, receiving: receiving a natural language question input by the user through a user interface, wherein the user interface supports both text input and voice input;
S12, preprocessing: preprocessing the question entered by the user, including removing irrelevant characters, correcting spelling errors, and converting voice input to text (if the question was originally entered by voice);
S13, syntactic analysis: analyzing the question using natural language processing technology and identifying its sentence structure, including sentence components such as subject, predicate and object;
S14, semantic understanding: performing semantic analysis on the question through a deep learning model and a natural language understanding algorithm, so as to understand the intent and contextual meaning of the question;
S15, key information extraction: based on the results of the syntactic analysis and semantic understanding, extracting the key information in the question, wherein the key information comprises:
Keywords: the main nouns, verbs and adjectives in the question, i.e., words that refer to a particular concept, object or action;
Entities: specific entities mentioned in the question, including names, places, organizations and dates;
Relationships and attributes: relationships between entities implied in the question, and related attributes and features;
Question type: the type of the question, determined from its structure and wording, including a factual query, an explanation request, or an operation guide.
The domain adaptability enhancement processing in S2 specifically includes:
S21: determining the specific domain to which the question belongs using a domain identification algorithm, and extracting the question and term library related to that domain;
S22: matching obscure terms and concepts in the question with nodes in a domain-specific knowledge graph constructed in cooperation with domain experts, so as to understand their deeper meaning and contextual relationships;
S23: adjusting the AIGC large model in real time by combining the context of the question with the domain knowledge graph, so as to enhance the model's ability to handle obscure questions and technical terms;
S24: inputting the adjusted question representation and the domain knowledge as enhancement information into the AIGC-based language model, in preparation for generating more accurate and more specialized answers.
The enhancement information is input to the AIGC-based language model as follows.
Model input adjustment: according to the input requirements of the AIGC language model, the encoded representation of the enhancement information is integrated into the model input; this requires adjusting the input layer of the model so that it accepts the new enhancement information vector as additional input.
Context information integration: in the decoding stage, the enhancement information is used as additional context to guide answer generation; this is achieved by modifying the model's attention mechanism so that the model takes into account the context and domain knowledge provided by the enhancement information when generating the answer.
Training and fine-tuning: finally, the AIGC model is trained or fine-tuned on a dataset containing the enhancement information, so that it adapts to the new input format and information and can use the enhancement information effectively to generate more accurate and relevant answers.
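One possible reading of using the enhancement information as additional context in the attention mechanism is to append the enhancement vectors to the memory that the decoder attends over. The sketch below shows that reading with plain scaled dot-product attention. It is an assumption about how the modification might look, not the method fixed by the source, and the two-dimensional vectors are invented for the example.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return context, weights


def attend_with_enhancement(query, question_vecs, enhancement_vecs):
    """Attend over question vectors and enhancement vectors jointly."""
    memory = question_vecs + enhancement_vecs
    return attend(query, memory, memory)


# Hypothetical decoder state, one question vector and one enhancement vector.
context, weights = attend_with_enhancement([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])
print(context, weights)
```

Because the attention weights are normalized over the joint memory, the decoder can shift weight toward domain knowledge when the question tokens alone are not informative enough.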
S21 specifically comprises:
Feature extraction: extracting language features from the user problems, including word frequency, part-of-speech tagging, semantic role tagging and context embedding vectors, wherein the features can comprehensively reflect the language characteristics and deep semantics of the problems;
Domain feature vectorization: converting the extracted features into a domain feature vector $\mathbf{v}_q$, wherein each dimension is a numerical representation of a language feature associated with the domain;
Domain similarity calculation: using a domain identification algorithm, computing the similarity between the question feature vector $\mathbf{v}_q$ and each vector of a predefined set of domain vectors $\{\mathbf{d}_1, \dots, \mathbf{d}_n\}$ (each domain vector $\mathbf{d}_i$ being the feature vector of a particular domain), the similarity being calculated as: $\mathrm{sim}(\mathbf{v}_q, \mathbf{d}_i) = \dfrac{\mathbf{v}_q \cdot \mathbf{d}_i}{\lVert \mathbf{v}_q \rVert \, \lVert \mathbf{d}_i \rVert}$;
wherein $\mathbf{v}_q \cdot \mathbf{d}_i$ is the dot product of the vectors, and $\lVert \mathbf{v}_q \rVert$ and $\lVert \mathbf{d}_i \rVert$ are the Euclidean norms of the vectors $\mathbf{v}_q$ and $\mathbf{d}_i$, respectively; the formula measures the angle between the question feature vector and each domain vector in the vector space, and the smaller the angle, the higher the similarity;
Domain determination: selecting the domain corresponding to the domain vector with the highest similarity as the specific domain to which the question belongs;
Term library extraction: according to the determined field, extracting professional questions and term libraries related to the field from a database, wherein the professional questions and term libraries comprise key terms, definitions, common questions and answer information of the field.
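The similarity comparison and domain determination of S21 can be illustrated with a minimal Python sketch. The domain names and the three-dimensional feature vectors are hypothetical values chosen for illustration; the patent does not specify the dimensionality or how the predefined domain vectors are obtained, and treating a zero vector as matching nothing is an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def identify_domain(question_vec, domain_vecs):
    """Return (domain_name, similarity) for the most similar domain vector."""
    scores = {name: cosine_similarity(question_vec, vec)
              for name, vec in domain_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical domain vectors, for illustration only.
DOMAIN_VECTORS = {
    "medicine": [0.9, 0.1, 0.0],
    "law": [0.1, 0.9, 0.1],
    "finance": [0.0, 0.2, 0.9],
}

domain, score = identify_domain([0.8, 0.2, 0.1], DOMAIN_VECTORS)
print(domain, round(score, 3))
```

Because cosine similarity depends only on the angle between the vectors, a long question and a short question with the same feature proportions map to the same domain.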
S22 specifically comprises the following steps:
Constructing a knowledge graph: in cooperation with domain experts, constructing a knowledge graph containing important concepts, terms, entities and their interrelationships within a domain, each node representing a concept or entity within a domain, and edges between nodes representing the relationship between concepts or entities, such as "is a", "belongs to", "is related to";
Cold term identification: analyzing the user's question through natural language processing technology and identifying the cold terms and concepts in it, wherein cold terms are words that occur with low frequency in a corpus but carry a specific meaning in a particular field;
Term-to-graph mapping: mapping the identified cold terms and concepts to nodes in the knowledge graph, wherein the mapping uses a matching algorithm based on semantic similarity that considers the semantic features of the terms and the attributes of the graph nodes to determine the best-matching node;
Contextual relationship resolution: using the edges of the knowledge graph to analyze the contextual relationships between the cold terms and the other concepts in the question, and revealing the role and meaning of each cold term in the specific question by examining the nodes connected to its matching node and the types of those relationships;
Deep meaning understanding: analyzing the deep meaning of the cold terms and concepts by combining the structural information of the knowledge graph with the contextual relationships of the terms, which helps to understand the question fully and supports the generation of accurate and relevant answers.
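The S22 steps can be sketched on a toy knowledge graph as follows. Everything in this sketch is hypothetical: the graph contents, the frequency threshold, and in particular the use of string similarity (`difflib.SequenceMatcher`) as a stand-in for the semantic-similarity matching that the text describes but does not specify.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy knowledge graph: node -> list of (relation, neighbour).
GRAPH = {
    "tachycardia": [("is a", "arrhythmia"), ("is related to", "heart rate")],
    "arrhythmia": [("belongs to", "cardiology")],
    "heart rate": [("belongs to", "cardiology")],
}


def rare_terms(question_tokens, corpus_counts, max_count=0):
    """Tokens whose corpus frequency is at most max_count (assumed threshold)."""
    return [t for t in question_tokens if corpus_counts.get(t, 0) <= max_count]


def best_node(term, graph, min_ratio=0.8):
    """Best-matching node by string similarity, or None if below min_ratio."""
    scored = [(SequenceMatcher(None, term, node).ratio(), node) for node in graph]
    if not scored:
        return None
    ratio, node = max(scored)
    return node if ratio >= min_ratio else None


def context_of(node, graph):
    """Relations leaving the matched node, i.e. its immediate graph context."""
    return graph.get(node, [])


# A tiny stand-in corpus; a real system would use a large reference corpus.
corpus_counts = Counter("what is the cause of the pain what is the dose".split())
tokens = "what is the cause of tachycardias".split()

candidates = rare_terms(tokens, corpus_counts)
matches = {term: best_node(term, GRAPH) for term in candidates}
for term, node in matches.items():
    print(term, "->", node, context_of(node, GRAPH))
```

Here the unseen token "tachycardias" is flagged as a cold term, mapped to the node "tachycardia", and explained through that node's outgoing relations.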
S23 specifically comprises the following steps:
S231, context and domain knowledge integration: integrating the context information of the problem and the deep meaning and relation of cold terms and related concepts thereof obtained through the domain knowledge graph into an enhanced feature representation, wherein the enhanced feature representation comprises the original semantic information of the problem and the deep knowledge of the specific domain;
S232, feature conversion: the integrated feature representation is converted into a form suitable for the AIGC large model using an autoencoder, an unsupervised neural network that learns an efficient encoding of its input data. Its basic structure comprises an encoder, which converts the input data into a lower-dimensional code, and a decoder, which attempts to reconstruct the input data from that code. The encoder is expressed as $h = f(W_e x + b_e)$, wherein $x$ is the input feature, $W_e$ is the encoder weight, $b_e$ is the bias term, $f$ is the activation function, and $h$ is the generated hidden-layer representation (i.e., the encoding). The decoder is expressed as $\hat{x} = g(W_d h + b_d)$, wherein $W_d$ is the decoder weight, $b_d$ is the bias term, $g$ is the activation function, and $\hat{x}$ is the reconstructed input. The goal of the autoencoder is to minimize the difference between the input $x$ and the reconstructed input $\hat{x}$, using the loss function $L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$. The autoencoder is trained to minimize this loss function and thereby learns a compressed representation of the input data, which is used for the feature conversion;
S233, model adjustment: based on the converted feature representation, the parameters of the AIGC large model are adjusted in real time. The adjustment uses transfer learning so that the AIGC large model adapts to the domain background and semantic requirements of the current question. Transfer learning is a technique that uses knowledge learned on one task to improve learning on another, related task. It generally involves a source task and a target task, with a corresponding source-domain dataset and target-domain dataset, and proceeds as follows:
Pre-training a model on the source task to learn a representation of the source-domain data;
Transferring part of the pre-trained model (e.g., the feature extraction layers) to the target task;
Fine-tuning the transferred part of the model on the target-domain data, while keeping the other parts unchanged or fine-tuning them as well;
S234, enhanced processing capability verification: a preset verification mechanism checks whether the adjusted model's ability to handle cold questions and technical terms has been significantly enhanced, ensuring that the adjustment achieves the expected effect.
The verification mechanism employs cross-validation or testing on simulated questions.
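The encoder, decoder, and reconstruction loss of step S232 can be sketched in plain Python as follows. The weights are hypothetical and untrained, and the choice of tanh for the encoder activation, the identity for the decoder activation, and squared Euclidean distance for the loss are assumptions of this sketch; the text names an activation function and a loss without fixing them.

```python
import math


def affine(weights, vec, bias):
    """Compute W * vec + b for a row-major weight matrix."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def encode(x, w_e, b_e):
    """Encoder: hidden code from tanh(W_e x + b_e) (assumed activation)."""
    return [math.tanh(z) for z in affine(w_e, x, b_e)]


def decode(h, w_d, b_d):
    """Decoder: reconstruction W_d h + b_d (identity activation assumed)."""
    return affine(w_d, h, b_d)


def reconstruction_loss(x, x_hat):
    """Squared Euclidean distance between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


# Hypothetical, untrained weights: 3 input features compressed to 2.
W_E = [[0.5, -0.2, 0.1], [0.0, 0.3, 0.4]]
B_E = [0.0, 0.1]
W_D = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
B_D = [0.0, 0.0, 0.0]

x = [1.0, 0.5, -0.5]
h = encode(x, W_E, B_E)        # compressed representation passed onward
x_hat = decode(h, W_D, B_D)    # reconstruction used only for training
loss = reconstruction_loss(x, x_hat)
print(h, loss)
```

Training would adjust the four parameter arrays to reduce the loss over many inputs; only the code `h` would then be passed on as the converted feature representation.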
The AIGC big model in S2 specifically includes:
Enhancement information integration: integrating the adjusted representation of the user question and the domain knowledge into an enhanced information set comprising the adjusted question features, domain-specific terms, concepts, and their interrelationships;
Context-aware encoding: processing the enhanced information set with an encoder to capture the complex relationships between the deep semantic features of the question and the domain knowledge, the encoder outputting a high-dimensional feature representation that combines the question context and the domain knowledge;
Answer generation: the encoded high-dimensional feature representation is input into the decoder of the AIGC large model; using this representation, and taking the question context and domain knowledge into account, the decoder generates a series of answer candidates through a sequence generation mechanism. The decoder can be based on the Transformer architecture and generate the answer sequence through self-attention and cross-attention mechanisms.
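How a decoder might attend to the encoded enhancement information can be sketched with a single cross-attention step. The text states only that self-attention and cross-attention may be used; the scaled dot-product form below is the standard Transformer formulation and is an assumption here, as are the toy vectors and all names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """For each decoder query, return a weighted average of encoder values."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical encoder outputs for two pieces of enhancement information.
encoder_keys = [[1.0, 0.0], [0.0, 1.0]]
encoder_values = [[10.0, 0.0], [0.0, 10.0]]
decoder_queries = [[5.0, 0.0]]  # one decoder position, aligned with the first key

outputs, weights = cross_attention(decoder_queries, encoder_keys, encoder_values)
print(weights[0], outputs[0])
```

The query aligned with the first key places most of its weight on the first value, which is how context and domain knowledge can steer each generated token.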
In S3, the answer candidates are evaluated using beam search, which keeps the generated answers diverse and highly relevant and helps avoid highly repetitive answers while ensuring answer quality and relevance; the generated answer candidates are then post-processed and optimized, including grammar correction, semantic consistency checking, and domain knowledge verification, to improve the accuracy and professionalism of the answers. The beam search specifically includes:
Initialization: setting a beam width $k$; at the beginning of decoding, a beam of size $k$ is initialized, each candidate being a partial solution sequence containing only a start tag (e.g., <start>);
Iterative expansion: in each iteration, for each partial solution sequence in the beam, the next word (or token) and its probability are predicted, and for each partial solution the $k$ words with the highest probability are selected and each is appended to that partial solution to form a new partial solution sequence;
Score calculation: the score of each newly generated partial solution sequence is calculated by accumulating the logarithmic probabilities of its constituent words, as follows:
$\mathrm{score}(Y) = \sum_{t=1}^{T} \log P(y_t \mid y_1, \dots, y_{t-1}, C)$, wherein $Y$ is a partial solution sequence, $y_t$ is the $t$-th word in the sequence, $P(y_t \mid y_1, \dots, y_{t-1}, C)$ is the conditional probability of the word $y_t$ given the preceding words $y_1, \dots, y_{t-1}$ and the context $C$ (i.e., the question representation and domain knowledge), and $T$ is the number of words in the sequence;
Selection and retention: after each iteration, the $k$ highest-scoring partial solutions are selected from all newly generated partial solutions and added to the beam for the next round of iterative expansion;
Termination condition: the iterative process continues until a predefined maximum length is reached or a partial solution sequence in the beam ends with an end tag (e.g., <end>);
The highest-scoring sequence in the final beam is selected as the answer candidate; where multiple answer candidates are required, the top-ranked sequences are selected.
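The beam search procedure can be sketched as follows. The next-token table is a hypothetical toy model standing in for the AIGC decoder, and one detail is a design choice of this sketch rather than something the text fixes: finished sequences are carried over unchanged, and the loop stops once every sequence in the beam has ended or the maximum length is reached.

```python
import math


def beam_search(next_token_probs, beam_width, max_len, start="<start>", end="<end>"):
    """Return up to beam_width sequences, best accumulated log-probability first."""
    beam = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beam:
            if seq[-1] == end:
                candidates.append((seq, score))  # finished: carry over unchanged
                continue
            probs = next_token_probs(seq)
            top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:beam_width]
            for token, p in top:
                if p > 0.0:  # skip impossible tokens so math.log is defined
                    candidates.append((seq + [token], score + math.log(p)))
        # Keep only the beam_width highest-scoring partial solutions.
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end for seq, _ in beam):
            break
    return beam


# Hypothetical toy model: the next-token distribution depends only on the last token.
TABLE = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"answer": 0.5, "reply": 0.5},
    "a": {"reply": 0.9, "answer": 0.1},
    "answer": {"<end>": 1.0},
    "reply": {"<end>": 1.0},
}


def toy_model(seq):
    return TABLE[seq[-1]]


results = beam_search(toy_model, beam_width=2, max_len=5)
best_sequence, best_score = results[0]
print(best_sequence, best_score)
```

In this toy table a greedy decoder would start with "the" (probability 0.6) and end with a total probability of 0.30, whereas the beam keeps "a" alive and finds "a reply" with probability 0.36, which is the benefit of retaining several partial solutions per step.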
S4 further includes selecting a formatting scheme according to the content and type of the answer; for example, if the answer is a list (e.g., of steps or options), it is presented in list form, and if the answer contains dates, numbers, or specific data, the format of this information is standardized and easy to read;
Adding context information to the answer so that the user can understand it even without seeing the complete question-answer history, which may include briefly restating the question, introducing background information for the answer, or explaining specific terms;
Highlighting or emphasizing key information in the answer, including using bold, italics, or color changes to draw the user's attention to the important parts.
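The three S4 output operations can be illustrated with a small formatting helper. The function name, its parameters, and the use of Markdown-style double asterisks for emphasis are assumptions of this sketch; the text does not prescribe a rendering syntax.

```python
def format_answer(answer, question=None, key_terms=()):
    """Render an answer as text: list layout, optional question restatement, emphasis."""
    if isinstance(answer, (list, tuple)):
        # List-type answers (steps, options) become a numbered list.
        body = "\n".join(f"{i}. {item}" for i, item in enumerate(answer, start=1))
    else:
        body = str(answer)
    for term in key_terms:
        # Emphasise key information; assumes key terms do not overlap each other.
        body = body.replace(term, f"**{term}**")
    if question:
        # Restate the question so the answer is understandable on its own.
        body = f"Question: {question}\n{body}"
    return body


formatted = format_answer(
    ["Open the valve", "Check the pressure"],
    question="How do I restart the pump?",
    key_terms=["pressure"],
)
print(formatted)
```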
As shown in fig. 2, a language processing question-answering system based on an AIGC large model is used to implement the above language processing question-answering method based on an AIGC large model, and includes the following modules:
User interface module: responsible for receiving natural language questions input by a user, supporting question input in both text and speech form, and presenting the final answer to the user in a natural, user-friendly manner;
Question understanding module: performing grammatical analysis and semantic understanding on the question input by the user by using natural language processing technology, and extracting the key information of the question, including keywords, entities, relations, and the question type;
Domain adaptability enhancement processing module: comprising a domain identification sub-module, a domain knowledge graph matching sub-module, and a domain adaptive algorithm sub-module, which are used, respectively, for determining the specific domain to which the question belongs, matching related concepts in the domain knowledge graph, and adjusting the AIGC large model in real time;
Answer generation module: generating a series of answer candidates according to the context of the question and the domain knowledge by using the AIGC large model that has undergone domain adaptability enhancement processing, and optimizing the answer generation process with a beam search algorithm;
Answer evaluation and selection module: comprehensively evaluating the answer candidates, including content overlap measurement, semantic similarity measurement, language fluency checking, and grammatical correctness verification, so as to select the optimal answer.
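One of the evaluation criteria named for the answer evaluation and selection module, content overlap measurement, can be sketched as a token-level F1 score. The choice of F1, the whitespace tokenisation, and the idea of scoring each candidate against a reference text (for example, a passage from the domain term library) are assumptions of this sketch; the text does not define the metric.

```python
from collections import Counter


def overlap_f1(candidate, reference):
    """Token-level F1 overlap between two whitespace-tokenised strings."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    common = sum((Counter(cand) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(cand)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def select_best(candidates, reference):
    """Pick the candidate with the highest overlap score."""
    return max(candidates, key=lambda c: overlap_f1(c, reference))


reference_text = "beam search keeps the k best sequences"
answer_candidates = ["beam search keeps k sequences", "the weather is nice"]
best_answer = select_best(answer_candidates, reference_text)
print(best_answer)
```

A fuller evaluator would combine this score with the semantic similarity, fluency, and grammar checks that the module also lists.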
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the invention is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
The present invention is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the present invention should be included in the scope of the present invention.
Claims (4)
1. A language processing question-answering method based on AIGC big models is characterized by comprising the following steps:
S1: receiving natural language questions input by a user, and extracting key information through grammar analysis and semantic understanding technology;
S2: inputting the extracted key information into an AIGC-based language model, and, through domain adaptability enhancement processing, generating a series of answer candidates with the AIGC large model according to the input information and the enhanced domain knowledge, wherein the domain adaptability enhancement processing specifically comprises the following steps:
S21: determining the specific domain to which the question belongs by using a domain identification algorithm, and extracting the question and term library related to that domain;
S22: using a domain-specific knowledge graph built in cooperation with domain experts, matching the cold terms and concepts in the question to nodes in the graph, so as to understand the deep meaning and contextual relationships of the cold terms and concepts;
S23: adjusting the AIGC large model in real time, combining the context of the question with the domain knowledge graph, so as to strengthen the model's handling of cold questions and technical terms;
S24: inputting the adjusted question representation and the domain knowledge into the AIGC-based language model as enhancement information, in preparation for generating more accurate and specialized answers;
The step S21 specifically comprises the following steps:
Feature extraction: extracting language features from the user problems, including word frequency, part-of-speech tagging, semantic role tagging and context embedding vectors;
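Domain feature vectorization: converting the extracted features into a domain feature vector $\mathbf{v}_q$, wherein each dimension is a numerical representation of a language feature associated with the domain;
Domain similarity calculation: using a domain identification algorithm, computing the similarity between the question feature vector $\mathbf{v}_q$ and each vector of a predefined set of domain vectors $\{\mathbf{d}_1, \dots, \mathbf{d}_n\}$, the similarity being calculated as: $\mathrm{sim}(\mathbf{v}_q, \mathbf{d}_i) = \dfrac{\mathbf{v}_q \cdot \mathbf{d}_i}{\lVert \mathbf{v}_q \rVert \, \lVert \mathbf{d}_i \rVert}$;
wherein $\mathbf{v}_q \cdot \mathbf{d}_i$ is the dot product of the vectors, and $\lVert \mathbf{v}_q \rVert$ and $\lVert \mathbf{d}_i \rVert$ are the Euclidean norms of the vectors $\mathbf{v}_q$ and $\mathbf{d}_i$, respectively;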
Domain determination: selecting the domain corresponding to the domain vector with the highest similarity as the specific domain to which the question belongs;
term library extraction: extracting professional questions and a term library related to the field from a database according to the determined field, wherein the professional questions and the term library comprise key terms, definitions, common questions and answer information thereof in the field;
the step S22 specifically includes:
constructing a knowledge graph: in cooperation with domain experts, constructing a knowledge graph containing important concepts, terms, entities and interrelationships thereof in the domain, wherein each node represents a concept or entity in the domain, and edges between the nodes represent the relationship between the concepts or entities;
Cold term identification: analyzing the user's question through natural language processing technology and identifying the cold terms and concepts in it, wherein cold terms are words that occur with low frequency in a corpus but carry a specific meaning in a particular field;
Term-to-graph mapping: mapping the identified cold terms and concepts to nodes in the knowledge graph, wherein the mapping uses a matching algorithm based on semantic similarity that considers the semantic features of the terms and the attributes of the graph nodes to determine the best-matching node;
Contextual relationship resolution: using the edges of the knowledge graph to analyze the contextual relationships between the cold terms and the other concepts in the question, and revealing the role and meaning of each cold term in the specific question by examining the nodes connected to its matching node and the types of those relationships;
Deep meaning understanding: analyzing the deep meaning of the cold terms and concepts by combining the structural information of the knowledge graph with the contextual relationships of the terms;
The step S23 specifically comprises the following steps:
S231, context and domain knowledge integration: integrating the context information of the problem and the deep meaning and relation of cold terms and related concepts thereof obtained through the domain knowledge graph into an enhanced feature representation, wherein the enhanced feature representation comprises the original semantic information of the problem and the deep knowledge of the specific domain;
S232, feature conversion: the integrated feature representation is converted into a form suitable for the AIGC large model using an autoencoder, the encoder being expressed as $h = f(W_e x + b_e)$, wherein $x$ is the input feature, $W_e$ is the encoder weight, $b_e$ is the bias term, $f$ is the activation function, and $h$ is the generated hidden-layer representation; the decoder is expressed as $\hat{x} = g(W_d h + b_d)$, wherein $W_d$ is the decoder weight, $b_d$ is the bias term, $g$ is the activation function, and $\hat{x}$ is the reconstructed input; the goal of the autoencoder is to minimize the difference between the input $x$ and the reconstructed input $\hat{x}$, using the loss function $L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$; the autoencoder is trained to minimize the loss function and thereby learns a compressed representation of the input data, the compressed representation being used for the feature conversion;
S233, model adjustment: based on the converted feature representation, the parameters of the AIGC large model are adjusted in real time, the adjustment using transfer learning so that the AIGC large model adapts to the domain background and semantic requirements of the current question, the transfer learning proceeding as follows:
Pre-training a model on the source task to learn a representation of the source-domain data;
Transferring part of the pre-trained model to the target task;
Fine-tuning the transferred part of the model on the target-domain data, while keeping the other parts unchanged or fine-tuning them as well;
S234, enhanced processing capability verification: a preset verification mechanism checks whether the adjusted model's ability to handle cold questions and technical terms has been significantly enhanced, ensuring that the adjustment achieves the expected effect;
The AIGC big model in S2 specifically includes:
enhancement information integration: integrating the adjusted representation of the user question and domain knowledge into an enhanced information set comprising adjusted question features, domain specific terms, concepts and their interrelationships;
context-aware coding: processing the enhanced information set with an encoder to capture complex relationships between deep semantic features of the problem and domain knowledge, the encoder outputting a high-dimensional feature representation of the comprehensive problem context and domain knowledge;
Answer generation: inputting the encoded high-dimensional feature representation into the decoder of the AIGC large model, the decoder using the high-dimensional feature representation, and taking the question context and the domain knowledge into account, to generate a series of answer candidates through a sequence generation mechanism;
S3: evaluating the answer candidates to select an optimal answer, the answer candidates being evaluated using beam search, which specifically includes:
Initialization: setting a beam width $k$; at the beginning of decoding, a beam of size $k$ is initialized, each candidate being a partial solution sequence containing only a start tag;
Iterative expansion: in each iteration, for each partial solution sequence in the beam, the next word and its probability are predicted, and for each partial solution the $k$ words with the highest probability are selected and each is appended to that partial solution to form a new partial solution sequence;
Score calculation: the score of each newly generated partial solution sequence is calculated by accumulating the logarithmic probabilities of its constituent words, as follows:
$\mathrm{score}(Y) = \sum_{t=1}^{T} \log P(y_t \mid y_1, \dots, y_{t-1}, C)$, wherein $Y$ is a partial solution sequence, $y_t$ is the $t$-th word in the sequence, $P(y_t \mid y_1, \dots, y_{t-1}, C)$ is the conditional probability of the word $y_t$ given the preceding words $y_1, \dots, y_{t-1}$ and the context $C$, and $T$ is the number of words in the sequence;
Selection and retention: after each iteration, the $k$ highest-scoring partial solutions are selected from all newly generated partial solutions and added to the beam for the next round of iterative expansion;
Termination condition: the iterative process continues until a predefined maximum length is reached or a partial solution sequence in the beam ends with an end tag;
selecting the highest-scoring sequence in the final beam as the answer candidate, and, where multiple answer candidates are required, selecting the top-ranked sequences;
S4: and outputting the optimal answer to the user in the form of natural language.
2. The method for processing questions and answers in a language based on AIGC big models of claim 1, wherein S1 specifically comprises:
S11, receiving: receiving a natural language question input by a user through a user interface, wherein the user interface supports two modes of text input and voice input;
S12, preprocessing: preprocessing the question input by the user, including removing irrelevant characters, correcting spelling errors, and converting voice input into text;
S13, grammar analysis: analyzing the questions by using natural language processing technology, and identifying sentence structures including sentence components of subjects, predicates and objects;
S14, semantic understanding: carrying out semantic analysis on the problem through a deep learning model and a natural language understanding algorithm, and understanding the intention and the contextual meaning of the problem;
S15, extracting key information: based on the results of the grammar analysis and semantic understanding, extracting key information in the problem, wherein the key information comprises:
Key words: the main nouns, verbs, and adjectives in a question refer to words of a particular concept, object, or action;
entity identification: specific entities mentioned in the question include name, place, organization, date;
relationship and attributes: relationships between entities implied in the problem and related attributes and features;
Question type: the type of question is determined based on the structure and wording of the question, including a factual query, an interpretation request, or an operation guide.
3. The language processing question-answering method based on an AIGC large model according to claim 1, wherein S4 further comprises selecting a formatting scheme according to the content and type of the answer, adding context information to the answer, and highlighting or emphasizing key information in the answer, including using bold, italics, or color changes to draw the user's attention to the important parts.
4. A language processing question-answering system based on AIGC big models for implementing a language processing question-answering method based on AIGC big models according to any one of claims 1-3, comprising the following modules:
a user interface module: the method is responsible for receiving natural language questions input by a user and supporting text and voice form input of the questions;
problem understanding module: carrying out grammar analysis and semantic understanding on the problem input by the user by using a natural language processing technology, and extracting key information of the problem, wherein the key information comprises key words, entities, relations and problem types;
Domain adaptability enhancement processing module: the method comprises a domain identification sub-module, a domain knowledge graph matching sub-module and a domain adaptive algorithm sub-module, wherein the domain identification sub-module, the domain knowledge graph matching sub-module and the domain adaptive algorithm sub-module are used for determining the specific domain to which the problem belongs, matching related concepts in the domain knowledge graph and adjusting AIGC large models in real time;
Answer generation module: generating a series of answer candidates according to the context and domain knowledge of the questions by using AIGC large models subjected to domain adaptability enhancement processing, and optimizing an answer generation process by adopting a beam search algorithm;
Answer evaluation and selection module: and comprehensively evaluating answer candidates, including content overlapping measurement, semantic similarity measurement, language fluency check and grammar correctness verification, so as to select an optimal answer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410479542.5A CN118093834B (en) | 2024-04-22 | 2024-04-22 | AIGC large model-based language processing question-answering system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410479542.5A CN118093834B (en) | 2024-04-22 | 2024-04-22 | AIGC large model-based language processing question-answering system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118093834A CN118093834A (en) | 2024-05-28 |
CN118093834B CN118093834B (en) | 2024-08-02 |
Family
ID=91155253
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410479542.5A Active CN118093834B (en) | 2024-04-22 | 2024-04-22 | AIGC large model-based language processing question-answering system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118093834B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118503394B (en) * | 2024-07-17 | 2024-10-01 | 山东浪潮科学研究院有限公司 | Self-adaptive decision method, system and storage medium based on large language model |
CN118643802B (en) * | 2024-08-13 | 2024-10-11 | 北京中数睿智科技有限公司 | AI customer service synthesis information reliability evaluation method based on communication system large model |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116932723A (en) * | 2023-07-28 | 2023-10-24 | 世优(北京)科技有限公司 | Man-machine interaction system and method based on natural language processing |
CN117556002A (en) * | 2023-11-03 | 2024-02-13 | 山东浪潮科学研究院有限公司 | Multi-round dialogue training method for large dialogue model |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110516229B (en) * | 2019-07-10 | 2020-05-05 | 杭州电子科技大学 | Domain-adaptive Chinese word segmentation method based on deep learning |
CN117055724B (en) * | 2023-05-08 | 2024-05-28 | 华中师范大学 | Working method of generating teaching resource system in virtual teaching scene |
CN116822625A (en) * | 2023-05-17 | 2023-09-29 | 广西卓洁电力工程检修有限公司 | Divergent-type associated fan equipment operation and detection knowledge graph construction and retrieval method |
CN116881426B (en) * | 2023-08-30 | 2023-11-10 | 环球数科集团有限公司 | AIGC-based self-explanatory question-answering system |
CN117235216A (en) * | 2023-08-30 | 2023-12-15 | 电子科技大学 | Knowledge reasoning method based on heterogeneous knowledge fusion |
CN117171333B (en) * | 2023-11-03 | 2024-08-02 | 国网浙江省电力有限公司营销服务中心 | Electric power file question-answering type intelligent retrieval method and system |
CN117521675A (en) * | 2023-11-06 | 2024-02-06 | 腾讯科技(深圳)有限公司 | Information processing method, device, equipment and storage medium based on large language model |
CN117708277B (en) * | 2023-11-10 | 2024-10-01 | 广州宝露软件开发有限公司 | AIGC-based question and answer system and application method |
Also Published As
Publication number | Publication date |
---|---|
CN118093834A (en) | 2024-05-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||