CN115658921A - Open domain scientific knowledge discovery method and device based on pre-training language model - Google Patents

Open domain scientific knowledge discovery method and device based on pre-training language model

Info

Publication number
CN115658921A
Authority
CN
China
Prior art keywords
prompt
language model
entity
sample data
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211392326.4A
Other languages
Chinese (zh)
Inventor
Chen Huajun (陈华钧)
Tian Xi (田玺)
Bi Zhen (毕祯)
Zhang Ningyu (张宁豫)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinhua Fusion Media Technology Development Beijing Co ltd
Zhejiang University ZJU
Original Assignee
Xinhua Fusion Media Technology Development Beijing Co ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinhua Fusion Media Technology Development Beijing Co ltd and Zhejiang University ZJU
Priority to CN202211392326.4A
Publication of CN115658921A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an open domain scientific knowledge discovery method and device based on a pre-trained language model. An input template is constructed that comprises a head entity, a first prompt, a second prompt, and a tail entity mask. For each triple containing a target relation, the head entity, the discrete tokens of the first prompt corresponding to the target relation, and the second prompt tokens (initialized with pre-trained embeddings) are filled into the input template, and the tail entity is masked, forming input sample data. A separate pre-trained language model is constructed for each target relation; the model is trained on a mask task with the input sample data corresponding to that relation, optimizing the embedded representations of the first and second prompts. The optimized embedded representations of the first and second prompts, together with the pre-trained language model, are then used to predict the missing entities in triples, improving the knowledge discovery efficiency and accuracy of the pre-trained language model.

Description

Open domain scientific knowledge discovery method and device based on pre-training language model
Technical Field
The invention belongs to the fields of natural language processing and machine learning, and particularly relates to an open domain scientific knowledge discovery method and device based on a pre-trained language model.
Background
The development of pre-trained language models has advanced natural language processing research to a new stage: universal language representations can be learned from massive unlabeled corpora, markedly benefiting downstream tasks. Studies have shown that pre-trained language models have a certain capacity to store information and answer questions, with different kinds of knowledge implicit in their parameters; acquiring this knowledge is crucial for language models across downstream tasks. However, as in most neural networks, knowledge in pre-trained language models is encoded diffusely, which makes it difficult to interpret and update.
Given this development, pre-trained language models are widely used to supplement the construction of scientific knowledge graphs. A scientific knowledge graph records various scientific knowledge as triples (head entity, relation, tail entity), where the head and tail entities can be entities from scientific fields such as biomedicine and chemistry (diseases, drugs, genes, molecules, and the like), and the relations include inclusion, action, type, and the like.
Knowledge learned during pre-training can be elicited through fine-tuning or prompting, and prompting is an effective way to acquire this knowledge directly, without additional training. Prompts can be divided into manually created prompts and automatically learned prompts. Although manually created prompts are intuitive and do enable triple completion to a degree, designing them takes time and expertise; for complex completion tasks, even an experienced prompt designer may be unable to find the best prompt manually. Automatically learned prompts automate the design of the prompt template but fail to fully capture scientific term information.
Patent document CN114706943A discloses an intention recognition method that performs intention recognition on input text augmented with a prompt, using a pre-trained language model. The added prompts do not capture scientific terminology information, so the knowledge mined during learning is of poor accuracy.
Patent document CN114661913A discloses an entity relationship extraction method based on a pre-trained language model which, by screening prompt templates, addresses the poor knowledge-mining efficiency caused by manual prompt-template labeling. However, the method still does not capture scientific term information, likewise leading to poor accuracy of the knowledge mined during learning.
Disclosure of Invention
In view of the above, an object of the present invention is to provide an open domain scientific knowledge discovery method and apparatus based on a pre-trained language model that fully exploit external scientific knowledge and learnable knowledge representations, better probe the knowledge stored in the pre-trained language model, and improve its knowledge discovery efficiency and accuracy.
To achieve the above object, an embodiment provides an open domain scientific knowledge discovery method based on a pre-trained language model, comprising the following steps:
extracting triples (head entity, relation, tail entity) from a scientific knowledge graph, and constructing a first prompt and a second prompt for each relation using external scientific knowledge;
constructing an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask;
taking all triples containing a target relation as sample data, filling the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and masking the tail entity to form input sample data;
constructing a separate pre-trained language model for each target relation, training the model on a mask task with the input sample data corresponding to that relation, and optimizing the embedded representations of the first and second prompts;
and predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
Preferably, the first and second prompts constructed from external scientific knowledge are related to the relation in the triple; the first prompt serves as discrete tokens in the input template, and the second prompt serves as the initialization tokens of a continuous space vector in the input template.
Preferably, the first prompt corresponding to the target relation is filled into the input template in discrete form, as the discrete tokens of the target relation.
Preferably, the second prompt corresponding to the target relation is filled into the input template in the form of a continuous vector, as the initialization tokens of the continuous vector of the target relation.
Preferably, pre-trained embeddings are used to embed the tokens of the second prompt corresponding to the target relation, and the embedded representation serves as the initialization of the continuous vector of the target relation.
Preferably, the input sample data corresponding to the target relation is used to train the pre-trained language model on a mask task: the parameters of the pre-trained language model are kept fixed, and the negative log-likelihood of the input sample data is minimized with the following loss function and a gradient correction method, updating the embedded representations of the first and second prompts;
$$\mathcal{L} = -\frac{1}{|D_r|} \sum_{(h,t)\in D_r} \log P([\mathrm{MASK}] = t \mid t_r(h))$$
where $\mathcal{L}$ denotes the loss function, $D_r$ denotes the set of head-entity/tail-entity pairs $(h, t)$ of the target relation $r$, $t_r(h)$ denotes the input sample data formed by filling a pair $(h, t)$ containing the target relation $r$ into the input template, and $P([\mathrm{MASK}] = t \mid t_r(h))$ denotes the probability, output by the pre-trained language model on input $t_r(h)$, that the masked position $[\mathrm{MASK}]$ equals the tail entity $t$.
Preferably, predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts and the pre-trained language model comprises:
taking the known entity in the triple as the head entity, and filling the head entity and the optimized embedded representations of the first and second prompts into the input template to form input sample data;
and inputting the input sample data into the pre-trained language model, which computes and outputs the prediction probability of the missing entity in the triple.
To achieve the above object, an embodiment provides an open domain scientific knowledge discovery apparatus based on a pre-trained language model, comprising:
an external knowledge acquisition module for extracting triples (head entity, relation, tail entity) from a scientific knowledge graph and constructing a first prompt and a second prompt for each relation using external scientific knowledge;
an input template construction module for constructing an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask;
an input sample data construction module for taking all triples containing a target relation as sample data, filling the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and masking the tail entity to form input sample data;
a training module for constructing a separate pre-trained language model for each target relation, training the model on a mask task with the input sample data corresponding to that relation, and optimizing the embedded representations of the first and second prompts;
and an application module for predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
Compared with the prior art, the beneficial effects of the invention include at least the following:
when the input sample data is constructed, external scientific knowledge is introduced as the first and second prompts of the target relation, so that the input sample data carries more scientifically relevant semantic information; when the pre-trained language model is trained, semantic information from the continuous space is integrated rather than relying entirely on hand-crafted prompts, so the knowledge in the pre-trained language model is captured more effectively, improving the model's knowledge discovery efficiency and accuracy.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of an open domain scientific knowledge discovery method based on a pre-trained language model provided by an embodiment;
FIG. 2 is a schematic diagram of an open domain scientific knowledge discovery apparatus based on a pre-trained language model according to an embodiment.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention more apparent, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
To enable the pre-trained language model to learn knowledge better, to better probe the knowledge within the language model, and to improve the knowledge discovery efficiency and accuracy of the pre-trained language model, the embodiments provide an open domain scientific knowledge discovery method and apparatus based on a pre-trained language model.
As shown in FIG. 1, the open domain scientific knowledge discovery method based on a pre-trained language model provided by the embodiment includes the following steps:
Step 1: extract triples from a scientific knowledge graph, classify the triples by relation, and construct a first prompt and a second prompt for each relation using external scientific knowledge.
In the embodiment, the triples (head entity h, relation r, tail entity t) extracted from the scientific knowledge graph record various scientific knowledge: the head entity may be a biomedical or chemical entity such as a disease, drug, gene, or molecule; the relation may be inclusion, action, type, or the like; and the tail entity may likewise be a biomedical or chemical entity such as a disease, drug, gene, or molecule.
In the embodiment, two prompts are constructed for each relation using external scientific knowledge: the first prompt t_r1 and the second prompt t_r2 are both related to the relation r in the triple and may or may not be equal. The first prompt serves as the discrete tokens of the relation in the input template, and the second prompt serves as the initialization tokens of the relation's continuous space vector in the input template.
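For concreteness, a minimal sketch of what the two prompts per relation might look like; the relation names and prompt strings below are invented examples for illustration, not the patent's actual prompts:
```python
# Hypothetical first (t_r1, discrete scientific terms) and second (t_r2,
# initialization text for the continuous vectors) prompts per relation.
# All strings here are illustrative assumptions, not taken from the patent.
relation_prompts = {
    "treats":   {"t_r1": "is a drug that treats",       "t_r2": "used in the treatment of"},
    "contains": {"t_r1": "is a compound that contains", "t_r2": "has as a component"},
}
```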
Step 2: construct an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask.
Objective facts are not independent of one another; data-driven prompts that reference the distribution of knowledge in the training set can be used to extract knowledge from a pre-trained language model, and can even recover objective facts from a randomly initialized one. Some previous methods (such as AutoPrompt) search a discrete vocabulary for the best K candidate prompts and then select and verify them, so the search space is limited to a discrete space. If a prompt token is replaced with a continuous vector, it need not correspond to a real token but is a continuous vector representation, in which the scientific term information carried by a discrete token is missing. For this reason, the embodiment uses scientific terms to construct the first prompt, used as discrete tokens, adds continuous space vectors initialized according to the second prompt as initialization tokens, and then masks the tokens of the triple's tail entity so they can be predicted. The constructed input template is:
t_r = [X] [Term]_1 … [Term]_m [P]_1 … [P]_n [MASK]
where [X] is the head entity of the triple; [Term]_1 … [Term]_m are the discrete tokens of the scientific terms related to the relation r, i.e. t_r1, with m the number of tokens in t_r1; [P]_1 … [P]_n ∈ ℝ^d are continuous vectors in the embedding space, where d denotes the dimension of the embedding vector and n is the number of tokens in t_r2; and [MASK] is the masked tail entity.
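Continuing the sketch above, the template can be realized as a token sequence in which the head entity and the first-prompt terms are ordinary vocabulary tokens, while reserved placeholder tokens hold the positions of the continuous vectors [P]. This assumes a BERT-style masked language model via the Hugging Face transformers library; the model name, the reserved-token scheme, and n are illustrative assumptions, not the patent's exact implementation:
```python
# A minimal sketch of the template t_r = [X][Term]_1..[Term]_m [P]_1..[P]_n [MASK].
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

n_soft = 4  # n: number of continuous-prompt positions [P]_1..[P]_n (assumed)
soft_tokens = [f"[P{i}]" for i in range(n_soft)]
tokenizer.add_special_tokens({"additional_special_tokens": soft_tokens})

def build_input(head_entity: str, first_prompt: str):
    """Fill the template: head entity [X], discrete term tokens (t_r1),
    continuous-prompt placeholders (the [P] positions), and a masked tail."""
    text = f"{head_entity} {first_prompt} {' '.join(soft_tokens)} {tokenizer.mask_token}"
    return tokenizer(text, return_tensors="pt")

# e.g. one input for the assumed relation "treats":
enc = build_input("aspirin", relation_prompts["treats"]["t_r1"])
```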
Step 3: taking all triples containing the target relation as sample data, fill the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and mask the tail entity to form the input sample data.
In the embodiment, the relation of interest is taken as the target relation, and all triples containing the target relation serve as the sample data for the pre-trained language model corresponding to that relation. The sample data is converted into input sample data through the input template. Specifically, the first and second prompts corresponding to the target relation fill the input template: the first prompt is filled in discrete form, as the discrete tokens of the target relation, and the second prompt is filled in the form of a continuous vector, as the initialization tokens of the target relation's continuous vector.
In this challenging non-convex optimization problem, a good initialization of the continuous vectors is important. Therefore, the embodiment uses the manually constructed second prompt to determine the number n and the positions of the continuous vectors [P] for each target relation, and initializes [P] with the pre-trained embeddings of the tokens in the second prompt.
Meanwhile, the head entity is filled into the input template and the tail entity is masked, forming the input sample data.
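Continuing the sketch, the initialization described above can be realized by copying the pre-trained embeddings of the second prompt's tokens into the embedding rows reserved for [P]. The model name, and the assumption that the second prompt tokenizes to at least n word pieces, are illustrative:
```python
# Sketch: initialize the continuous vectors [P] from the pre-trained embeddings
# of the tokens of the manually constructed second prompt t_r2.
import torch
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.resize_token_embeddings(len(tokenizer))   # make room for [P0]..[P3]

second_prompt = relation_prompts["treats"]["t_r2"]        # assumed example string
init_ids = tokenizer(second_prompt, add_special_tokens=False)["input_ids"]

emb = model.get_input_embeddings()                        # the word-embedding matrix
soft_ids = tokenizer.convert_tokens_to_ids(soft_tokens)
with torch.no_grad():                                     # copy rows, no grad tracking
    for soft_id, init_id in zip(soft_ids, init_ids[:n_soft]):
        emb.weight[soft_id] = emb.weight[init_id].clone()
```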
Step 4: construct a separate pre-trained language model for each target relation, train it on a mask task with the input sample data corresponding to that relation, and optimize the embedded representations of the first and second prompts.
In the embodiment, each target relation corresponds to one pre-trained language model. The input sample data corresponding to the target relation is used to train the model on a mask task; the parameters of the pre-trained language model are kept fixed, and the negative log-likelihood of the input sample data is minimized with the following loss function and a gradient correction method, updating the embedded representations of the first and second prompts:
$$\mathcal{L} = -\frac{1}{|D_r|} \sum_{(h,t)\in D_r} \log P([\mathrm{MASK}] = t \mid t_r(h))$$
where $\mathcal{L}$ denotes the loss function, $D_r$ denotes the set of head-entity/tail-entity pairs $(h, t)$ of the target relation $r$, $t_r(h)$ denotes the input sample data formed by filling a pair $(h, t)$ containing the target relation $r$ into the input template, and $P([\mathrm{MASK}] = t \mid t_r(h))$ denotes the probability, output by the pre-trained language model on input $t_r(h)$, that the masked position $[\mathrm{MASK}]$ equals the tail entity $t$.
The knowledge in the pre-trained language model is probed through its knowledge mask task, and the embedded representations of the first and second prompts are optimized so as to better predict the missing entities in triples.
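A compact sketch of this optimization, continuing the code above: the language model's parameters stay frozen, and a gradient mask (one reading of the "gradient correction" step) lets only the prompt embedding rows move while minimizing -log P([MASK] = t | t_r(h)). For brevity the mask here covers only the [P] rows; the first prompt's term-token rows could be added to soft_ids in the same way, since the patent optimizes both prompts. Single-token tail entities, the optimizer, and the learning rate are assumptions:
```python
# Sketch: mask-task training with a frozen LM; only the [P] embedding rows move.
import torch

for p in model.parameters():
    p.requires_grad = False
emb.weight.requires_grad = True                  # only the embedding matrix trains

# weight_decay=0 so rows with zeroed gradients stay exactly untouched
optimizer = torch.optim.AdamW([emb.weight], lr=1e-3, weight_decay=0.0)
row_mask = torch.zeros(emb.weight.shape[0], 1)
row_mask[soft_ids] = 1.0                         # rows allowed to change

def training_step(head_entity, first_prompt, tail_entity):
    enc = build_input(head_entity, first_prompt)
    labels = torch.full_like(enc["input_ids"], -100)      # ignore non-mask positions
    tail_id = tokenizer(tail_entity, add_special_tokens=False)["input_ids"][0]
    labels[enc["input_ids"] == tokenizer.mask_token_id] = tail_id
    loss = model(**enc, labels=labels).loss               # = -log P([MASK]=t | t_r(h))
    optimizer.zero_grad()
    loss.backward()
    emb.weight.grad *= row_mask                           # gradient correction: zero
    optimizer.step()                                      # grads of non-prompt rows
    return loss.item()
```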
Step 5: predict the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
Predicting the missing entities in triples containing the target relation with the optimized embedded representations of the first and second prompts and the pre-trained language model comprises the following steps:
take the known entity in the triple as the head entity, and fill the head entity and the optimized embedded representations of the first and second prompts into the input template to form input sample data; the input sample data is a complete cloze-style statement.
Input the input sample data into the pre-trained language model, compute and output the prediction probabilities of the missing entity in the triple, and screen candidates by prediction probability; the selected entities complete the incomplete triples into full triples, realizing open domain scientific knowledge discovery.
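Finally, a sketch of the prediction step under the same assumptions: the trained template is filled with the known head entity, and the model's probability distribution at the [MASK] position is read off and ranked (single-token candidates only, an assumed simplification):
```python
# Sketch: predict the missing tail entity by ranking the model's probabilities
# at the [MASK] position.
import torch

@torch.no_grad()
def predict_tail(head_entity, first_prompt, top_k=5):
    enc = build_input(head_entity, first_prompt)
    logits = model(**enc).logits                          # (1, seq_len, vocab)
    pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    probs = logits[0, pos].softmax(dim=-1)
    top = probs.topk(top_k)
    return [(tokenizer.decode([int(i)]), float(p))
            for i, p in zip(top.indices, top.values)]

# e.g. predict_tail("aspirin", relation_prompts["treats"]["t_r1"])
# returns the top-k (token, probability) pairs for the masked tail entity.
```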
Based on the same inventive concept, as shown in FIG. 2, an embodiment further provides an open domain scientific knowledge discovery apparatus based on a pre-trained language model, comprising:
an external knowledge acquisition module for extracting triples from a scientific knowledge graph, classifying the triples by relation, and constructing a first prompt and a second prompt for each relation using external scientific knowledge;
an input template construction module for constructing an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask;
an input sample data construction module for taking all triples containing a target relation as sample data, filling the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and masking the tail entity to form input sample data;
a training module for constructing a separate pre-trained language model for each target relation, training the model on a mask task with the input sample data corresponding to that relation, and optimizing the embedded representations of the first and second prompts;
and an application module for predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
It should be noted that the division into functional modules in the apparatus provided by the above embodiment is only an example; in practice, the functions may be assigned to different functional modules as needed, i.e., the internal structure of the terminal or server may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus provided by the above embodiment shares the same concept as the method embodiment; its specific implementation process is detailed in the method embodiment and is not repeated here.
The above embodiments describe the technical solutions and advantages of the present invention in detail. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit the invention; any modifications, additions, or equivalent substitutions made within the principles of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. An open domain scientific knowledge discovery method based on a pre-trained language model, characterized by comprising the following steps:
extracting triples (head entity, relation, tail entity) from a scientific knowledge graph, and constructing a first prompt and a second prompt for each relation using external scientific knowledge;
constructing an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask;
taking all triples containing a target relation as sample data, filling the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and masking the tail entity to form input sample data;
constructing a separate pre-trained language model for each target relation, training the model on a mask task with the input sample data corresponding to that relation, and optimizing the embedded representations of the first and second prompts;
and predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
2. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 1, wherein the first and second prompts constructed from external scientific knowledge are related to the relation in the triple, the first prompt serving as discrete tokens in the input template and the second prompt serving as the initialization tokens of a continuous space vector in the input template.
3. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 1, wherein the first prompt corresponding to the target relation is filled into the input template in discrete form, as the discrete tokens of the target relation.
4. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 1, wherein the second prompt corresponding to the target relation is filled into the input template in the form of a continuous vector, as the initialization tokens of the continuous vector of the target relation.
5. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 4, wherein pre-trained embeddings are used to embed the tokens of the second prompt corresponding to the target relation, and the embedded representation serves as the initialization of the continuous vector of the target relation.
6. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 1, wherein the pre-trained language model is trained on a mask task with the input sample data corresponding to the target relation, the parameters of the pre-trained language model are kept fixed, and the negative log-likelihood of the input sample data is minimized with the following loss function and a gradient correction method to update the embedded representations of the first and second prompts:
$$\mathcal{L} = -\frac{1}{|D_r|} \sum_{(h,t)\in D_r} \log P([\mathrm{MASK}] = t \mid t_r(h))$$
where $\mathcal{L}$ denotes the loss function, $D_r$ denotes the set of head-entity/tail-entity pairs $(h, t)$ of the target relation $r$, $t_r(h)$ denotes the input sample data formed by filling a pair $(h, t)$ containing the target relation $r$ into the input template, and $P([\mathrm{MASK}] = t \mid t_r(h))$ denotes the probability, output by the pre-trained language model on input $t_r(h)$, that the masked position $[\mathrm{MASK}]$ equals the tail entity $t$.
7. The open domain scientific knowledge discovery method based on a pre-trained language model according to claim 1, wherein predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts and the pre-trained language model comprises:
taking the known entity in the triple as the head entity, and filling the head entity and the optimized embedded representations of the first and second prompts into the input template to form input sample data;
and inputting the input sample data into the pre-trained language model, which computes and outputs the prediction probability of the missing entity in the triple.
8. An open domain scientific knowledge discovery apparatus based on a pre-trained language model, characterized by comprising:
an external knowledge acquisition module for extracting triples (head entity, relation, tail entity) from a scientific knowledge graph and constructing a first prompt and a second prompt for each relation using external scientific knowledge;
an input template construction module for constructing an input template for the pre-trained language model, the input template comprising a head entity, a first prompt, a second prompt, and a tail entity mask;
an input sample data construction module for taking all triples containing a target relation as sample data, filling the head entity of each triple and the first and second prompts corresponding to the target relation into the input template, and masking the tail entity to form input sample data;
a training module for constructing a separate pre-trained language model for each target relation, training the model on a mask task with the input sample data corresponding to that relation, and optimizing the embedded representations of the first and second prompts;
and an application module for predicting the missing entities in triples containing the target relation using the optimized embedded representations of the first and second prompts together with the pre-trained language model, thereby realizing open domain scientific knowledge discovery.
CN202211392326.4A 2022-11-08 2022-11-08 Open domain scientific knowledge discovery method and device based on pre-training language model Pending CN115658921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211392326.4A CN115658921A (en) 2022-11-08 2022-11-08 Open domain scientific knowledge discovery method and device based on pre-training language model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211392326.4A CN115658921A (en) 2022-11-08 2022-11-08 Open domain scientific knowledge discovery method and device based on pre-training language model

Publications (1)

Publication Number Publication Date
CN115658921A true CN115658921A (en) 2023-01-31

Family

ID=85016304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211392326.4A Pending CN115658921A (en) 2022-11-08 2022-11-08 Open domain scientific knowledge discovery method and device based on pre-training language model

Country Status (1)

Country Link
CN (1) CN115658921A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117725223A (en) * 2023-11-20 2024-03-19 中国科学院成都文献情报中心 Knowledge discovery-oriented scientific experiment knowledge graph construction method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination