CN116204616A - Artificial intelligence question-answering method based on semantic training algorithm - Google Patents

Artificial intelligence question-answering method based on semantic training algorithm Download PDF

Info

Publication number
CN116204616A
Authority
CN
China
Prior art keywords
semantic
training
artificial intelligence
semantic training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211711312.4A
Other languages
Chinese (zh)
Inventor
徐杭 (Xu Hang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Baijue Technology Co., Ltd.
Original Assignee
Nanjing Baijue Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Baijue Technology Co., Ltd.
Priority to CN202211711312.4A
Publication of CN116204616A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2453 Classification techniques relating to the decision surface non-linear, e.g. polynomial classifier
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Nonlinear Science (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to artificial intelligence technology, and in particular to an artificial intelligence question-answering method based on a semantic training algorithm, comprising the following steps: S1.1: collecting corpus data; S1.2: performing data preprocessing on the collected corpus data; S1.3: constructing a semantic training model based on the collected corpus data; S1.4: performing model optimization training on the constructed semantic training model. According to the invention, the collected corpus data are desensitized, which prevents the leakage of sensitive data and ensures the safety of the artificial intelligence question answering. By constructing a semantic training model and carrying out mask training and sequential logic training on it, the learning capacity of the semantic training model is improved, the extensibility of the artificial intelligence question-answering system is improved, and the accuracy and reply rate of the artificial intelligence question answering are improved, so that questions can be answered more efficiently. The method can be applied to various intelligent fields.

Description

Artificial intelligence question-answering method based on semantic training algorithm
Technical Field
The invention relates to artificial intelligence technology, and in particular to an artificial intelligence question-answering method based on a semantic training algorithm.
Background
Natural language processing (NLP) is the scientific study of how to process the media, such as speech and text, that humans use to convey information. It is an important field within the currently very active area of artificial intelligence, and dialogue systems are an important research direction within NLP. The central question is how to give a computer the intelligence to interact with humans, which makes dialogue both an important task for artificial intelligence and a very challenging one. In 1950, Turing published an evaluation method for computer intelligence in "Computing Machinery and Intelligence", named the "Turing test", which for the first time gave computer intelligence a clear goal: to demonstrate the level of a computer's intelligence through human-machine conversation. It has attracted wide attention from scholars ever since. Current dialogue systems mainly comprise question-answering systems, which help users answer questions, and task-oriented dialogue systems, which mainly provide corresponding operation prompts for users within specified scenario tasks. A question-answering system is mainly a large-scale system that provides knowledge queries for users: it processes the question input by the user, analyses the key content of the question, and retrieves and generates an answer from the existing candidate question-answer pairs.
A traditional task-oriented system mainly obtains the intention input from the user side, applies a series of vectorization steps to it using existing text-processing methods, converts the input information into a vector representation that the machine system can recognize, computes matching scores for the relevant answers from this vector information, selects among the existing candidate answers according to a set matching strategy and the computed matching scores, and returns the corresponding answer to the user. Such a system treats the dialogue as a pipeline. This approach mostly requires manual labeling of the features of the semantic representation during the dialogue, which consumes a great deal of manpower and material resources and is costly. Moreover, being task-oriented, the system can only complete the work in the specific scenarios set for it, cannot be applied to other fields, and has poor extensibility.
Disclosure of Invention
The invention aims to overcome the defects in the background art by providing an artificial intelligence question-answering method based on a semantic training algorithm.
The technical scheme adopted by the invention is as follows:
The artificial intelligence question-answering method based on a semantic training algorithm comprises the following steps:
S1.1: collecting corpus data;
S1.2: performing data preprocessing on the collected corpus data;
S1.3: constructing a semantic training model based on the collected corpus data;
S1.4: performing model optimization training on the constructed semantic training model.
As a preferred technical scheme of the invention: in step S1.2, error detection and correction processing and desensitization processing are carried out on the collected corpus data.

As a preferred technical scheme of the invention: in step S1.3, the input corpus data are defined; the input is defined as

D = {u_1, u_2, …, u_m}

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and u_i represents the utterance replied in the i-th turn, defined as

u_i = {c_{i1}, c_{i2}, …, c_{i n_i}}

where n_i is the length of the utterance replied in the i-th turn and c_{ij} represents the j-th character of the utterance replied in the i-th turn, j ∈ [1, n_i].

As a preferred technical scheme of the invention: in step S1.3, the input of the semantic training model comprises a token encoding, a segment encoding and a position encoding, and the sum of the token encoding, the segment encoding and the position encoding is taken as the input, where the token encoding is t_{ij}, whose corresponding character embedding table is E_t, with V representing the vocabulary size; the segment encoding is s_{ij}, whose corresponding segment embedding table is E_s, with S representing the maximum number of segments; and the position encoding is p_{ij}, whose corresponding position embedding table is E_p, with N representing the sequence length of the entire dialog, i.e.

N = Σ_{i=1}^{m} n_i

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and n_i is the length of the utterance replied in the i-th turn.

The total input of the semantic training model is then obtained as:

e_{ij} = t_{ij} + s_{ij} + p_{ij}

where e_{ij} is the input of the semantic training model, one embedding vector for each position, i ∈ [1, m], j ∈ [1, n_i]. Feature extraction gives:

E_{ij} = transformer(e_{ij})

where E_{ij} represents the output vector for each character of the sequence.

As a preferred technical scheme of the invention: the embedding vector corresponding to each position is recognized by a nonlinear classifier, and mask training is used to judge whether the embedding vectors at other positions contain masks and what actual text each mask corresponds to.

As a preferred technical scheme of the invention: the mask training is as follows:

for the replied utterance u_i with some of its characters replaced by masks, the characters are predicted by a nonlinear character classifier, with the formula:

ĉ_{ij} = softmax(E_{ij} E_t^T + b_1)

where ĉ_{ij} represents the predicted value of the masked character c_{ij}, E_t^T is the transpose of the character embedding table, and b_1 is a bias parameter of the nonlinear classifier.

As a preferred technical scheme of the invention: in the mask training process, the loss function L_1(θ, θ_1) of the mask training is expressed as:

L_1(θ, θ_1) = - Σ_{m'=1}^{M} Σ_{v=1}^{V} c_{m'v} log ĉ_{m'v}

where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, M is the number of masked characters in the input sequence, m' ∈ [1, M] indexes the masked characters, and V represents the vocabulary size.

As a preferred technical scheme of the invention: for the utterance u_i replied in the i-th turn, sequential logic training is performed; the embedding vector E_{i1} of the i-th segment is first used to predict the turn r̂_i of the segment in the dialog,

r̂_i = softmax(W_2 E_{i1} + b_2)

which is compared with the actual turn, where r̂_i is the predicted turn of the i-th segment, W_2 is a unit vector of the segment embedding table, and b_2 is a bias parameter of the nonlinear classifier.

As a preferred technical scheme of the invention: in the sequential logic training, the loss function L_2(θ, θ_2) of the sequential logic training is expressed as:

L_2(θ, θ_2) = - Σ_{e'=1}^{E} Σ_{s=1}^{S} r_{e's} log r̂_{e's}

where θ denotes the encoder parameters of the semantic training model, θ_2 denotes the parameters of the nonlinear turn classifier, r̂_{e'} is the predicted turn of the e'-th segment, E is the number of segments of the dialog, e' ∈ [1, E] indexes the segments, and S represents the maximum number of segments.

As a preferred technical scheme of the invention: in step S1.4, the total loss function L of the semantic training model is obtained:

L = L_1(θ, θ_1) + L_2(θ, θ_2)

and the semantic training model is trained with the goal of minimizing the loss function, where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, θ_2 denotes the parameters of the nonlinear turn classifier, L_1(θ, θ_1) is the loss function of the mask training, and L_2(θ, θ_2) is the loss function of the sequential logic training.
Compared with the prior art, the artificial intelligence question-answering method based on a semantic training algorithm has the following beneficial effects:
According to the invention, the collected corpus data are desensitized, which prevents the leakage of sensitive data and ensures the safety of the artificial intelligence question answering. By constructing a semantic training model and carrying out mask training and sequential logic training on it, the learning capacity of the semantic training model is improved, the extensibility of the artificial intelligence question-answering system is improved, and the accuracy and reply rate of the artificial intelligence question answering are improved, so that questions can be answered more efficiently. The method can be applied to various intelligent fields.
Drawings
FIG. 1 is a flow chart of a method of a preferred embodiment of the present invention.
Detailed Description
It should be noted that, where there is no conflict, the embodiments of the invention and the features in the embodiments may be combined with each other. The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings; obviously, the described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Referring to FIG. 1, a preferred embodiment of the present invention provides an artificial intelligence question-answering method based on a semantic training algorithm, comprising the following steps:
S1.1: collecting corpus data;
S1.2: performing data preprocessing on the collected corpus data;
S1.3: constructing a semantic training model based on the collected corpus data;
S1.4: performing model optimization training on the constructed semantic training model.
In step S1.2, error detection and correction processing and desensitization processing are carried out on the collected corpus data.
In step S1.3, the input corpus data are defined; the input is defined as

D = {u_1, u_2, …, u_m}

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and u_i represents the utterance replied in the i-th turn, defined as

u_i = {c_{i1}, c_{i2}, …, c_{i n_i}}

where n_i is the length of the utterance replied in the i-th turn and c_{ij} represents the j-th character of the utterance replied in the i-th turn, j ∈ [1, n_i].
In step S1.3, the input of the semantic training model comprises a token encoding, a segment encoding and a position encoding, and the sum of the token encoding, the segment encoding and the position encoding is taken as the input, where the token encoding is t_{ij}, whose corresponding character embedding table is E_t, with V representing the vocabulary size; the segment encoding is s_{ij}, whose corresponding segment embedding table is E_s, with S representing the maximum number of segments; and the position encoding is p_{ij}, whose corresponding position embedding table is E_p, with N representing the sequence length of the entire dialog, i.e.

N = Σ_{i=1}^{m} n_i

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and n_i is the length of the utterance replied in the i-th turn.

The total input of the semantic training model is then obtained as:

e_{ij} = t_{ij} + s_{ij} + p_{ij}

where e_{ij} is the input of the semantic training model, one embedding vector for each position, i ∈ [1, m], j ∈ [1, n_i]. Feature extraction gives:

E_{ij} = transformer(e_{ij})

where E_{ij} represents the output vector for each character of the sequence.
The embedding vector corresponding to each position is recognized by a nonlinear classifier, and mask training is used to judge whether the embedding vectors at other positions contain masks and what actual text each mask corresponds to.
The mask training is as follows:

for the replied utterance u_i with some of its characters replaced by masks, the characters are predicted by a nonlinear character classifier, with the formula:

ĉ_{ij} = softmax(E_{ij} E_t^T + b_1)

where ĉ_{ij} represents the predicted value of the masked character c_{ij}, E_t^T is the transpose of the character embedding table, and b_1 is a bias parameter of the nonlinear classifier.
In the mask training process, the loss function L_1(θ, θ_1) of the mask training is expressed as:

L_1(θ, θ_1) = - Σ_{m'=1}^{M} Σ_{v=1}^{V} c_{m'v} log ĉ_{m'v}

where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, M is the number of masked characters in the input sequence, m' ∈ [1, M] indexes the masked characters, and V represents the vocabulary size.
For the utterance u_i replied in the i-th turn, sequential logic training is performed; the embedding vector E_{i1} of the i-th segment is first used to predict the turn r̂_i of the segment in the dialog,

r̂_i = softmax(W_2 E_{i1} + b_2)

which is compared with the actual turn, where r̂_i is the predicted turn of the i-th segment, W_2 is a unit vector of the segment embedding table, and b_2 is a bias parameter of the nonlinear classifier.
In the sequential logic training, the loss function L_2(θ, θ_2) of the sequential logic training is expressed as:

L_2(θ, θ_2) = - Σ_{e'=1}^{E} Σ_{s=1}^{S} r_{e's} log r̂_{e's}

where θ denotes the encoder parameters of the semantic training model, θ_2 denotes the parameters of the nonlinear turn classifier, r̂_{e'} is the predicted turn of the e'-th segment, E is the number of segments of the dialog, e' ∈ [1, E] indexes the segments, and S represents the maximum number of segments.
In step S1.4, the total loss function L of the semantic training model is obtained:

L = L_1(θ, θ_1) + L_2(θ, θ_2)

and the semantic training model is trained with the goal of minimizing the loss function, where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, θ_2 denotes the parameters of the nonlinear turn classifier, L_1(θ, θ_1) is the loss function of the mask training, and L_2(θ, θ_2) is the loss function of the sequential logic training.
In this embodiment, corpus data are collected, error detection and correction are performed on the collected corpus data, and desensitization processing is carried out. During desensitization, numbers appearing in the corpus data, such as identity card numbers, contact numbers, bank card numbers and house numbers in addresses, as well as sensitive information such as names and addresses, can be randomly replaced to prevent information leakage.
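By way of illustration only, a minimal Python sketch of such rule-based desensitization follows; the regular expressions, the digit-randomizing replacement scheme, and the sample text are assumptions for illustration and not part of the patent's disclosure (name and address replacement would additionally need a lexicon or named-entity recognizer, omitted here).

```python
import random
import re

# Illustrative desensitization sketch: digit strings that look like ID-card,
# bank-card, phone, or house numbers are replaced digit-by-digit with random
# digits, so the format is preserved but the value is destroyed.
PATTERNS = [
    re.compile(r"\d{17}[\dXx]"),    # 18-digit identity card number
    re.compile(r"\d{16,19}"),       # bank card number
    re.compile(r"1\d{10}"),         # 11-digit mobile phone number
    re.compile(r"\d{1,4}(?=号)"),   # house number preceding '号'
]

def _randomize(match: re.Match) -> str:
    return "".join(random.choice("0123456789") if ch.isdigit() else ch
                   for ch in match.group())

def desensitize(text: str) -> str:
    for pattern in PATTERNS:
        text = pattern.sub(_randomize, text)
    return text

print(desensitize("电话 13912345678，身份证 110101199001011234，住幸福路12号"))
```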
The input corpus data are defined; the input is defined as

D = {u_1, u_2, …, u_m}

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and u_i represents the utterance replied in the i-th turn, defined as

u_i = {c_{i1}, c_{i2}, …, c_{i n_i}}

where n_i is the length of the utterance replied in the i-th turn and c_{ij} represents the j-th character of the utterance replied in the i-th turn, j ∈ [1, n_i].
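Concretely, a dialog D = {u_1, …, u_m} can be held as a list of utterances and flattened into per-character ids with turn-aligned segment ids, as in the following sketch; the toy dialog, the reserved ids, and the vocabulary construction are assumptions for illustration.

```python
# Flatten a dialog D = {u_1, ..., u_m} into character ids and segment ids.
# Ids 0 and 1 are reserved for [PAD] and [MASK] (an illustrative convention).
dialog = ["你好", "你好，请问有什么可以帮您", "我想查询订单"]  # u_1 .. u_m
vocab = {ch: i + 2 for i, ch in enumerate(sorted(set("".join(dialog))))}

chars, segments, first_positions = [], [], []
for i, utterance in enumerate(dialog):       # the i-th turn, utterance u_i
    first_positions.append(len(chars))       # position of its first character c_i1
    for ch in utterance:                     # characters c_i1 .. c_{i n_i}
        chars.append(vocab[ch])
        segments.append(i)                   # segment id = turn index

print(len(chars))        # N = n_1 + ... + n_m
print(first_positions)   # used later to pick out the E_i1 vectors
```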
The input of the semantic training model comprises a token encoding, a segment encoding and a position encoding, and the sum of the token encoding, the segment encoding and the position encoding is taken as the input, where the token encoding is t_{ij}, whose corresponding character embedding table is E_t, with V representing the vocabulary size; the segment encoding is s_{ij}, whose corresponding segment embedding table is E_s, with S representing the maximum number of segments; and the position encoding is p_{ij}, whose corresponding position embedding table is E_p, with N representing the sequence length of the entire dialog, i.e.

N = Σ_{i=1}^{m} n_i

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and n_i is the length of the utterance replied in the i-th turn.

The total input of the semantic training model is then obtained as:

e_{ij} = t_{ij} + s_{ij} + p_{ij}

where e_{ij} is the input of the semantic training model, one embedding vector for each position, i ∈ [1, m], j ∈ [1, n_i]. Feature extraction gives:

E_{ij} = transformer(e_{ij})

where E_{ij} represents the output vector for each character of the sequence.
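A minimal PyTorch sketch of this input construction and encoding follows; the hidden size, the number of layers and heads, and the use of nn.TransformerEncoder are assumptions for illustration, not parameters fixed by the method.

```python
import torch
import torch.nn as nn

class SemanticEncoder(nn.Module):
    """Sketch: e_ij = token + segment + position embedding, then a
    transformer produces one output vector E_ij per character."""

    def __init__(self, vocab_size: int, max_segments: int, max_len: int,
                 d: int = 256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d)    # character embedding table E_t
        self.seg = nn.Embedding(max_segments, d)  # segment embedding table E_s
        self.pos = nn.Embedding(max_len, d)       # position embedding table E_p
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, chars: torch.Tensor, segments: torch.Tensor) -> torch.Tensor:
        # chars, segments: (batch, N), with N = n_1 + ... + n_m for the dialog
        positions = torch.arange(chars.size(1), device=chars.device)
        e = self.tok(chars) + self.seg(segments) + self.pos(positions)  # e_ij
        return self.encoder(e)                                          # E_ij
```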
The embedding vector corresponding to each position is recognized by a nonlinear classifier, and mask training is used to judge whether the embedding vectors at other positions contain masks and what actual text each mask corresponds to.
For the replied utterance

u_i = {c_{i1}, c_{i2}, …, c_{i n_i}}

if the 2nd and 3rd characters are selected as masks, masking the 2nd and 3rd characters gives:

u_i = {c_{i1}, [MASK], [MASK], c_{i4}, …, c_{i n_i}}

The masked characters are predicted by the nonlinear character classifier, with the formula:

ĉ_{ij} = softmax(E_{ij} E_t^T + b_1)

where ĉ_{ij} represents the predicted value of the masked character c_{ij}, E_t^T is the transpose of the character embedding table, and b_1 is a bias parameter of the nonlinear classifier.
Mask learning on the characters improves the learning capacity of the semantic training model and improves its answering efficiency.
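The masking and character prediction can be sketched as follows, reusing the encoder above; tying the output projection to the transpose of the character embedding table follows the formula, while the 15% masking probability and the zero bias b_1 are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

MASK_ID = 1  # assumed id of the [MASK] token

def mask_training_loss(model, chars, segments, mask_prob=0.15):
    """Randomly mask characters, predict them through E_ij E_t^T + b_1,
    and return the cross-entropy loss L_1 over the masked positions."""
    is_masked = torch.rand(chars.shape, device=chars.device) < mask_prob
    inputs = chars.masked_fill(is_masked, MASK_ID)

    E = model(inputs, segments)                                   # (batch, N, d)
    b1 = torch.zeros(model.tok.num_embeddings, device=E.device)   # bias b_1
    logits = E @ model.tok.weight.T + b1                          # E_ij E_t^T + b_1

    # L_1: cross entropy (softmax applied internally) over masked characters
    return F.cross_entropy(logits[is_masked], chars[is_masked])
```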
The interaction information involves questions of sequential logic; for example, the utterance that answers a question may be a reply to a sentence earlier in the interaction. Therefore, sequential logic training needs to be performed on the utterance u_i replied in the i-th turn: the embedding vector E_{i1} of the i-th segment is first used to predict the turn r̂_i of the segment in the dialog,

r̂_i = softmax(W_2 E_{i1} + b_2)

which is compared with the actual turn, where r̂_i is the predicted turn of the i-th segment, W_2 is a unit vector of the segment embedding table, and b_2 is a bias parameter of the nonlinear classifier.
Sequential logic training on the replied utterances improves the accuracy and reply rate of the answers.
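A matching sketch of the turn prediction follows; representing W_2 and b_2 by a single linear layer over the S possible turns is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def order_training_loss(E, first_positions, true_turns, round_head):
    """Predict each segment's dialog turn from its first output vector E_i1
    and return the cross-entropy loss L_2 against the actual turns.

    E: (batch, N, d) encoder outputs; first_positions, true_turns: (batch, m)
    long tensors; round_head: torch.nn.Linear(d, S) standing in for W_2, b_2.
    """
    idx = first_positions.unsqueeze(-1).expand(-1, -1, E.size(-1))
    E_i1 = torch.gather(E, 1, idx)       # (batch, m, d): the E_i1 vectors
    logits = round_head(E_i1)            # W_2 E_i1 + b_2 for each segment
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           true_turns.reshape(-1))
```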
From the two training processes, the loss functions are obtained respectively:

L_1(θ, θ_1) = - Σ_{m'=1}^{M} Σ_{v=1}^{V} c_{m'v} log ĉ_{m'v}

where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, M is the number of masked characters in the input sequence, m' ∈ [1, M] indexes the masked characters, and V represents the vocabulary size; and

L_2(θ, θ_2) = - Σ_{e'=1}^{E} Σ_{s=1}^{S} r_{e's} log r̂_{e's}

where θ denotes the encoder parameters of the semantic training model, θ_2 denotes the parameters of the nonlinear turn classifier, r̂_{e'} is the predicted turn of the e'-th segment, E is the number of segments of the dialog, e' ∈ [1, E] indexes the segments, and S represents the maximum number of segments. Here c_{m'v} is the one-hot indicator of the true masked character and ĉ_{m'v} its predicted probability, and r_{e's} and r̂_{e's} are the corresponding quantities for the turn of the e'-th segment.
The total loss function of the semantic training model is then obtained:

L = L_1(θ, θ_1) + L_2(θ, θ_2)

and the semantic training model is trained with the goal of minimizing the loss function, where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, θ_2 denotes the parameters of the nonlinear turn classifier, L_1(θ, θ_1) is the loss function of the mask training, and L_2(θ, θ_2) is the loss function of the sequential logic training. Training in this way improves the accuracy of the model and reduces its error.
The collected corpus data can also be expanded to train the model further and improve the question-answering performance of the semantic training model.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present specification is described in terms of embodiments, not every embodiment contains only a single technical solution; this manner of description is adopted for clarity only. The specification should be taken as a whole, and the technical solutions in the embodiments may be suitably combined to form other embodiments that will be apparent to those skilled in the art.

Claims (10)

1. An artificial intelligence question-answering method based on a semantic training algorithm, characterized by comprising the following steps:
S1.1: collecting corpus data;
S1.2: performing data preprocessing on the collected corpus data;
S1.3: constructing a semantic training model based on the collected corpus data;
S1.4: performing model optimization training on the constructed semantic training model.
2. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 1, characterized in that: in step S1.2, error detection and correction processing and desensitization processing are carried out on the collected corpus data.
3. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 1, characterized in that: in step S1.3, the input corpus data are defined; the input is defined as

D = {u_1, u_2, …, u_m}

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and u_i represents the utterance replied in the i-th turn, defined as

u_i = {c_{i1}, c_{i2}, …, c_{i n_i}}

where n_i is the length of the utterance replied in the i-th turn and c_{ij} represents the j-th character of the utterance replied in the i-th turn, j ∈ [1, n_i].
4. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 3, characterized in that: in step S1.3, the input of the semantic training model comprises a token encoding, a segment encoding and a position encoding, and the sum of the token encoding, the segment encoding and the position encoding is taken as the input, where the token encoding is t_{ij}, whose corresponding character embedding table is E_t, with V representing the vocabulary size; the segment encoding is s_{ij}, whose corresponding segment embedding table is E_s, with S representing the maximum number of segments; and the position encoding is p_{ij}, whose corresponding position embedding table is E_p, with N representing the sequence length of the entire dialog, i.e.

N = Σ_{i=1}^{m} n_i

where m represents the total number of turns of the dialog, i ∈ [1, m] indexes the i-th turn of the dialog, and n_i is the length of the utterance replied in the i-th turn;

the total input of the semantic training model is then obtained as:

e_{ij} = t_{ij} + s_{ij} + p_{ij}

where e_{ij} is the input of the semantic training model, one embedding vector for each position, i ∈ [1, m], j ∈ [1, n_i]; feature extraction gives:

E_{ij} = transformer(e_{ij})

where E_{ij} represents the output vector for each character of the sequence.
5. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 4, characterized in that: the embedding vector corresponding to each position is recognized by a nonlinear classifier, and mask training is used to judge whether the embedding vectors at other positions contain masks and what actual text each mask corresponds to.
6. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 5, characterized in that: the mask training is as follows:

for the replied utterance u_i with some of its characters replaced by masks, the characters are predicted by a nonlinear character classifier, with the formula:

ĉ_{ij} = softmax(E_{ij} E_t^T + b_1)

where ĉ_{ij} represents the predicted value of the masked character c_{ij}, E_t^T is the transpose of the character embedding table, and b_1 is a bias parameter of the nonlinear classifier.
7. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 6, characterized in that: in the mask training process, the loss function L_1(θ, θ_1) of the mask training is expressed as:

L_1(θ, θ_1) = - Σ_{m'=1}^{M} Σ_{v=1}^{V} c_{m'v} log ĉ_{m'v}

where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, M is the number of masked characters in the input sequence, m' ∈ [1, M] indexes the masked characters, and V represents the vocabulary size.
8. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 7, characterized in that: for the utterance u_i replied in the i-th turn, sequential logic training is performed; the embedding vector E_{i1} of the i-th segment is first used to predict the turn r̂_i of the segment in the dialog,

r̂_i = softmax(W_2 E_{i1} + b_2)

which is compared with the actual turn, where r̂_i is the predicted turn of the i-th segment, W_2 is a unit vector of the segment embedding table, and b_2 is a bias parameter of the nonlinear classifier.
9. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 8, characterized in that: in the sequential logic training, the loss function L_2(θ, θ_2) of the sequential logic training is expressed as:

L_2(θ, θ_2) = - Σ_{e'=1}^{E} Σ_{s=1}^{S} r_{e's} log r̂_{e's}

where θ denotes the encoder parameters of the semantic training model, θ_2 denotes the parameters of the nonlinear turn classifier, r̂_{e'} is the predicted turn of the e'-th segment, E is the number of segments of the dialog, e' ∈ [1, E] indexes the segments, and S represents the maximum number of segments.
10. The artificial intelligence question-answering method based on a semantic training algorithm according to claim 1, characterized in that: in step S1.4, the total loss function L of the semantic training model is obtained:

L = L_1(θ, θ_1) + L_2(θ, θ_2)

and the semantic training model is trained with the goal of minimizing the loss function, where θ denotes the encoder parameters of the semantic training model, θ_1 denotes the parameters of the nonlinear character classifier, θ_2 denotes the parameters of the nonlinear turn classifier, L_1(θ, θ_1) is the loss function of the mask training, and L_2(θ, θ_2) is the loss function of the sequential logic training.
CN202211711312.4A 2022-12-29 2022-12-29 Artificial intelligence question-answering method based on semantic training algorithm Pending CN116204616A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211711312.4A CN116204616A (en) 2022-12-29 2022-12-29 Artificial intelligence question-answering method based on semantic training algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211711312.4A CN116204616A (en) 2022-12-29 2022-12-29 Artificial intelligence question-answering method based on semantic training algorithm

Publications (1)

Publication Number Publication Date
CN116204616A true CN116204616A (en) 2023-06-02

Family

ID=86508635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211711312.4A Pending CN116204616A (en) 2022-12-29 2022-12-29 Artificial intelligence question-answering method based on semantic training algorithm

Country Status (1)

Country Link
CN (1) CN116204616A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115358243A (en) * 2022-07-27 2022-11-18 上海浦东发展银行股份有限公司 Training method, device, equipment and storage medium for multi-round dialogue recognition model
CN115391512A (en) * 2022-08-30 2022-11-25 上海浦东发展银行股份有限公司 Training method, device, equipment and storage medium of dialogue language model


Similar Documents

Publication Publication Date Title
CN110134771B (en) Implementation method of multi-attention-machine-based fusion network question-answering system
CN110781680B (en) Semantic similarity matching method based on twin network and multi-head attention mechanism
CN110134946B (en) Machine reading understanding method for complex data
CN111738016A (en) Multi-intention recognition method and related equipment
CN112101044B (en) Intention identification method and device and electronic equipment
CN114926150B (en) Digital intelligent auditing method and device for transformer technology compliance assessment
CN113742733B (en) Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type
CN112800184B (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN112037773A (en) N-optimal spoken language semantic recognition method and device and electronic equipment
CN113723105A (en) Training method, device and equipment of semantic feature extraction model and storage medium
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN114239574A (en) Miner violation knowledge extraction method based on entity and relationship joint learning
CN112328748A (en) Method for identifying insurance configuration intention
CN111597816A (en) Self-attention named entity recognition method, device, equipment and storage medium
CN114492460A (en) Event causal relationship extraction method based on derivative prompt learning
CN112488111B (en) Indication expression understanding method based on multi-level expression guide attention network
CN113065352B (en) Method for identifying operation content of power grid dispatching work text
CN116595023A (en) Address information updating method and device, electronic equipment and storage medium
CN116341519A (en) Event causal relation extraction method, device and storage medium based on background knowledge
CN116483314A (en) Automatic intelligent activity diagram generation method
CN109960782A (en) A kind of Tibetan language segmenting method and device based on deep neural network
CN114416991A (en) Method and system for analyzing text emotion reason based on prompt
CN116204616A (en) Artificial intelligence question-answering method based on semantic training algorithm
CN114461779A (en) Case writing element extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination