CN113033206A - Bridge detection field text entity identification method based on machine reading understanding - Google Patents
- Publication number
- CN113033206A (application number CN202110357215.9A)
- Authority
- CN
- China
- Prior art keywords
- embedding
- character
- text
- word
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a method for recognizing text entities in the field of bridge detection based on machine reading understanding, which comprises the following steps: S1, acquiring a question text and a target text; S2, extracting character embedding, bigram embedding and weighted word embedding from the question text and the target text; S3, concatenating the character embedding, the bigram embedding and the weighted word embedding to obtain a joint feature representation; and S4, inputting the joint feature representation into a neural network to complete entity recognition. Because character embedding captures features only at the context-character level, the invention purposefully introduces external dictionary information to enhance the feature representation of the model input, namely a bigram embedding unit and a weighted word embedding unit trained on a large-scale corpus, so as to extract semantically richer features and achieve a better entity recognition effect.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method for recognizing text entities in the field of bridge detection based on machine reading understanding.
Background
For years, natural language processing has been an important research direction in the field of artificial intelligence and one of its core research tasks; driven by advances in machine learning and deep learning, a great deal of research has been carried out and substantial progress made. However, little applied research has been conducted on intelligent decision support for bridge health management and maintenance based on natural language processing technology. During long-term operation, bridges are affected by internal and external factors such as traffic load, environmental excitation, emergencies and degradation of structural materials, so various defects in structural components are inevitable. Meanwhile, the bridge industry has established an operation-period bridge health management business system consisting of daily inspection and maintenance, frequent inspection, regular/special inspection, load testing, maintenance and reinforcement, structural health monitoring and the like, accumulating massive historical bridge health management data characterized by obvious multi-source heterogeneity and rapidly growing data volume. However, this health management and maintenance information is still stored in relational databases as linked documents; services such as bridge structural state evaluation or maintenance decision support still rely mainly on manual consultation of the relevant documents, while massive fine-grained structure and defect information is scattered in unstructured text and needs to be identified and extracted. Therefore, the problem of intelligent decision support in the field of bridge management and maintenance based on natural language processing technology still needs to be further solved.
At present, with the rapid development of deep learning, end-to-end deep neural network models have matured and become the mainstream approach to natural language processing problems, overcoming the heavy dependence of traditional machine learning models on feature engineering and their insufficient ability to represent context features. Named entity recognition has always been fundamental research content in natural language processing; its essence is to extract predefined, valuable information from semi-structured or unstructured text, store it in a semi-structured form, and support intelligent applications such as knowledge graphs and automatic question answering. For the task of named entity recognition in bridge detection texts, the only existing method is a named entity recognition approach for structural states and maintenance activities based on a bridge ontology and semi-supervised CRF (Conditional Random Fields).
Existing related research considers neither how to effectively exploit prior information nor the problem of nested entities, so outer-layer and nested named entity recognition suited to the descriptive characteristics of Chinese bridge detection reports still needs further study.
Disclosure of Invention
To address the defects of the prior art, the invention discloses a method for recognizing text entities in the bridge detection field based on machine reading understanding. On the basis of character embedding, it purposefully introduces external dictionary information to enhance the feature representation of the model input, namely a bigram embedding unit and a weighted word embedding unit trained on a large-scale corpus, thereby achieving a better entity recognition effect.
In order to solve the technical problems, the invention adopts the following technical scheme:
A bridge detection field text entity identification method based on machine reading understanding comprises the following steps:
S1, acquiring a question text and a target text;
S2, extracting character embedding, bigram embedding and weighted word embedding from the question text and the target text;
S3, concatenating the character embedding, the bigram embedding and the weighted word embedding to obtain a joint feature representation;
S4, inputting the joint feature representation into a neural network to complete entity recognition.
Preferably, the method for extracting character embedding in step S2 includes:
serializing the question text as Q = [q_1, q_2, ..., q_m], where q_i denotes the i-th character of the question text, and serializing the target text as C = [c_1, c_2, ..., c_n], where c_i denotes the i-th character of the target text;
concatenating Q and C into X = [x_1, x_2, ..., x_l], where x_i ∈ Q ∪ C and l = m + n;
performing a lookup in the character embedding table to obtain a vector matrix E ∈ R^(l×d) that can be input into the BERT model, whose i-th element is e_i = w_c(x_i), the vector representation of character x_i in the character embedding table w_c, where d denotes the dimension of each character vector in the character embedding table;
inputting the vector matrix E into the BERT model to obtain the character embedding, whose i-th element is e_i^bert = w_bert(x_i), where w_bert denotes the character embedding table of the BERT model.
Preferably, the method for extracting the weighted word embedding in step S2 includes:
the four sets B, M, E and S are constructed as follows:
B(x_i) = {w_{i,k} | w_{i,k} ∈ D, i < k ≤ l}
M(x_i) = {w_{j,k} | w_{j,k} ∈ D, 1 ≤ j < i < k ≤ l}
E(x_i) = {w_{j,i} | w_{j,i} ∈ D, 1 ≤ j < i}
S(x_i) = {x_i | x_i ∈ D}
where D denotes the external dictionary and w_{i,k} denotes the subsequence [x_i, x_{i+1}, ..., x_k] of the input sequence X; B(x_i) collects the words w_{i,k} matched in D in which x_i is the start character; M(x_i) collects those in which x_i is a middle character; E(x_i) collects those in which x_i is the end character; and S(x_i) holds the current character when it is itself matched in D. If any of the four sets is empty, it is padded with the special word NONE;
the weighted word embedding is constructed as
e_i^w = [v^s(B); v^s(M); v^s(E); v^s(S)]
where e_i^w denotes the i-th element of the weighted word embedding, and v^s(B), v^s(M), v^s(E) and v^s(S) denote the weighted representations of B, M, E and S, respectively.
Preferably, the weighted representation v^s(L) of a word set L is calculated as
v^s(L) = (1/Z) · Σ_{w ∈ L} z(w) · ω_word(w),  Z = Σ_{w ∈ B∪M∪E∪S} z(w)
where z(w) denotes the frequency with which the word w appears in the external dictionary D, ω_word(w) denotes the word embedding representation of w looked up in the word embedding table, and Z denotes the total frequency over the matched word sets.
Preferably, step S4 comprises:
inputting the joint feature representation into a neural network to extract feature information;
and predicting character probabilities and entity spans from the feature information to complete entity recognition.
Preferably, the neural network is a BiLSTM.
In summary, compared with the prior art, the invention has the following technical effects:
(1) Because character embedding only extracts features at the context-character level, the invention purposefully introduces external dictionary information, namely a bigram embedding unit and a weighted word embedding unit trained on a large-scale corpus, to enhance the feature representation of the model input and extract semantically richer features, so the entity recognition effect is better.
(2) Because the BERT pre-training model only supports character-level input for Chinese, the initial vector matrix E is used as the input to the BERT pre-training model; BERT position-encodes E and, using a multi-head attention mechanism, extracts more accurate context-related semantic information, and fine-tuning makes the output character-level vector representation better suited to the context of bridge detection report texts.
(3) The method for generating the weighted word embedding not only introduces word embeddings from the dictionary well, but also loses no context information, because the matching results can be exactly recovered from the four character sets.
(4) The weighting algorithm does not adopt a dynamic weighting scheme such as an attention mechanism, but uses the static word frequency, which can be counted in advance; this greatly accelerates the calculation of each word's weight.
(5) The invention calculates the probability of each character being a start or end index as well as the probability of each whole entity span, with the advantage that outer-layer entities and nested entities contained in the text can be decoded simultaneously to obtain the final answer.
(6) The invention adopts a BiLSTM (Bidirectional Long Short-Term Memory) model for encoding, extracting richer bidirectional context feature information.
Drawings
FIG. 1 is a flowchart of an embodiment of a bridge inspection field text entity recognition method based on machine reading understanding disclosed in the present invention;
FIG. 2 is an overall architecture diagram of another embodiment of a bridge inspection field text entity recognition method based on machine reading understanding;
FIG. 3 is a diagram illustrating an example of the weighted word embedding unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the invention discloses a method for recognizing text entities in the bridge detection field based on machine reading understanding, which comprises the following steps:
S1, acquiring a question text and a target text;
S2, extracting character embedding, bigram embedding and weighted word embedding from the question text and the target text;
S3, concatenating the character embedding, the bigram embedding and the weighted word embedding to obtain a joint feature representation;
S4, inputting the joint feature representation into a neural network to complete entity recognition.
The question text in the present invention carries prior information. Because character embedding only extracts features at the context-character level, the invention purposefully introduces external dictionary information to enhance the feature representation of the model input, namely a bigram embedding unit and a weighted word embedding unit trained on a large-scale corpus, so as to extract semantically richer features and achieve a better entity recognition effect.
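The splicing in step S3 can be sketched as a per-position concatenation of the three embedding sequences; the numeric values below are hypothetical placeholders, not trained vectors:

```python
def joint_features(char_emb, bigram_emb, weighted_word_emb):
    """S3: concatenate the three embeddings position by position
    into one joint feature vector per character."""
    return [c + b + w for c, b, w in zip(char_emb, bigram_emb, weighted_word_emb)]

# Toy embeddings for a 3-character input (dimensions chosen arbitrarily).
char_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
bigram_emb = [[0.7], [0.8], [0.9]]
weighted_word_emb = [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]]

joint = joint_features(char_emb, bigram_emb, weighted_word_emb)
assert joint[0] == [0.1, 0.2, 0.7, 1.0, 1.1]
```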
In specific implementation, the method for extracting character embedding in step S2 comprises:
serializing the question text as Q = [q_1, q_2, ..., q_m], where q_i denotes the i-th character of the question text, and serializing the target text as C = [c_1, c_2, ..., c_n], where c_i denotes the i-th character of the target text;
concatenating Q and C into X = [x_1, x_2, ..., x_l], where x_i ∈ Q ∪ C and l = m + n;
performing a lookup in the character embedding table to obtain a vector matrix E ∈ R^(l×d) that can be input into the BERT model, whose i-th element is e_i = w_c(x_i), the vector representation of character x_i in the character embedding table w_c, where d denotes the dimension of each character vector in the character embedding table;
inputting the vector matrix E into the BERT model to obtain the character embedding, whose i-th element is e_i^bert = w_bert(x_i), where w_bert denotes the character embedding table of the BERT model.
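A minimal sketch of the serialization and table lookup described above, assuming a hypothetical toy character embedding table w_c (the BERT encoding step that follows is omitted):

```python
def serialize_and_concat(question, target):
    """S2, character level: Q = [q_1..q_m], C = [c_1..c_n], X = Q + C, l = m + n."""
    q, c = list(question), list(target)
    return q + c

# Hypothetical toy character-embedding table w_c with dimension d = 2;
# in the patent, the resulting matrix E is fed into a BERT encoder.
w_c = {"桥": [0.1, 0.2], "墩": [0.3, 0.4], "裂": [0.5, 0.6], "缝": [0.7, 0.8]}
UNK = [0.0, 0.0]

X = serialize_and_concat("裂缝", "桥墩裂缝")
E = [w_c.get(ch, UNK) for ch in X]   # vector matrix E, shape l x d

assert len(X) == 2 + 4 and E[0] == [0.5, 0.6]
```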
Because the BERT pre-training model only supports character-level input for Chinese, the initial vector matrix E is used as the input to the BERT pre-training model; BERT position-encodes E and, using a multi-head attention mechanism, extracts more accurate context-related semantic information, and fine-tuning makes the output character-level vector representation better suited to the context of bridge detection report texts.
The introduction of bigram embedding effectively copes with the problem of characterizing different entities composed of the same characters. In the field of bridge detection there are a large number of two-character entity expressions, such as "bridge pier", "abutment", "abutment cap" and "abutment body", and most entities composed of two characters share common characters, such as "bridge", "pier" and "abutment"; although the characters are the same, the semantic information they express differs between entities. Therefore, the input characters are converted into bigram embedding representations to enhance the semantic expression of the input data on both entities and non-entities, where w_b denotes the bigram embedding table and the i-th element of the bigram embedding is e_i^b = w_b(x_i, x_{i+1}).
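A sketch of how each position can be mapped to its character bigram before the lookup in w_b; the end-of-sequence pad symbol is an assumption for illustration:

```python
def bigram_sequence(x, pad="</s>"):
    """Map each position i to the character bigram (x_i, x_{i+1});
    the last position is padded so the sequence keeps length l."""
    return [x[i] + (x[i + 1] if i + 1 < len(x) else pad) for i in range(len(x))]

bigrams = bigram_sequence(list("桥墩裂缝"))
# Each bigram would then be looked up in the pretrained bigram table w_b.
assert bigrams == ["桥墩", "墩裂", "裂缝", "缝</s>"]
```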
As shown in FIG. 3, in specific implementation, the method for extracting the weighted word embedding in step S2 comprises:
the four sets B, M, E and S are constructed as follows:
B(x_i) = {w_{i,k} | w_{i,k} ∈ D, i < k ≤ l}
M(x_i) = {w_{j,k} | w_{j,k} ∈ D, 1 ≤ j < i < k ≤ l}
E(x_i) = {w_{j,i} | w_{j,i} ∈ D, 1 ≤ j < i}
S(x_i) = {x_i | x_i ∈ D}
where D denotes the external dictionary and w_{i,k} denotes the subsequence [x_i, x_{i+1}, ..., x_k] of the input sequence X; B(x_i) collects the words w_{i,k} matched in D in which x_i is the start character; M(x_i) collects those in which x_i is a middle character; E(x_i) collects those in which x_i is the end character; and S(x_i) holds the current character when it is itself matched in D. If any of the four sets is empty, it is padded with the special word NONE;
the weighted word embedding is constructed as
e_i^w = [v^s(B); v^s(M); v^s(E); v^s(S)]
where e_i^w denotes the i-th element of the weighted word embedding, and v^s(B), v^s(M), v^s(E) and v^s(S) denote the weighted representations of B, M, E and S, respectively.
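The construction of the four sets can be sketched with a brute-force dictionary match; the toy dictionary below is hypothetical:

```python
def bmes_sets(x, dictionary):
    """For each character x_i, collect dictionary words in which it is the
    Begin, Middle, End, or Single character; empty sets are padded with NONE."""
    n = len(x)
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in range(n)]
    for i in range(n):
        for k in range(i, n):
            w = "".join(x[i:k + 1])
            if w not in dictionary:
                continue
            if i == k:
                sets[i]["S"].add(w)          # single-character match
            else:
                sets[i]["B"].add(w)          # x_i starts the word
                sets[k]["E"].add(w)          # x_k ends the word
                for j in range(i + 1, k):
                    sets[j]["M"].add(w)      # interior characters
    for s in sets:                            # pad empty matches with NONE
        for key in s:
            if not s[key]:
                s[key].add("NONE")
    return sets

D = {"桥墩", "裂缝", "墩"}
sets = bmes_sets(list("桥墩裂缝"), D)
assert sets[0]["B"] == {"桥墩"} and sets[1]["E"] == {"桥墩"} and sets[1]["S"] == {"墩"}
```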
This method of generating the weighted word embedding not only introduces word embeddings from the dictionary well, but also loses no context information, because the matching results can be exactly recovered from the four character sets.
In specific implementation, the weighted representation v^s(L) of a word set L is calculated as
v^s(L) = (1/Z) · Σ_{w ∈ L} z(w) · ω_word(w),  Z = Σ_{w ∈ B∪M∪E∪S} z(w)
where z(w) denotes the frequency with which the word w appears in the external dictionary D, ω_word(w) denotes the word embedding representation of w looked up in the word embedding table, and Z denotes the total frequency over the matched word sets.
The word set L is regenerated from the original dictionary; the biggest difference between L and the original dictionary is that the word embeddings are weighted by the above formula.
The weighting algorithm does not adopt a dynamic weighting scheme such as an attention mechanism, but uses the static word frequency, which can be counted in advance; this greatly accelerates the calculation of each word's weight.
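A sketch of the frequency-weighted set representation under the formula above, with hypothetical frequencies and two-dimensional toy embeddings:

```python
def weighted_set_repr(word_set, z, omega, Z, dim=2):
    """v^s(L) = (1/Z) * sum_{w in L} z(w) * omega(w), where z(w) is the static
    word frequency and Z is the total frequency over B ∪ M ∪ E ∪ S."""
    v = [0.0] * dim
    for w in word_set:
        v = [a + z[w] * b for a, b in zip(v, omega[w])]
    return [a / Z for a in v]

z = {"桥墩": 3, "桥墩裂缝": 1}              # static frequencies, counted in advance
omega = {"桥墩": [1.0, 0.0], "桥墩裂缝": [0.0, 1.0]}
Z = sum(z.values())                          # union of all matched sets, here both words
v = weighted_set_repr({"桥墩", "桥墩裂缝"}, z, omega, Z)
assert v == [0.75, 0.25]
```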
In specific implementation, step S4 comprises:
inputting the joint feature representation into a neural network to extract feature information;
and predicting character probabilities and entity spans from the feature information to complete entity recognition.
In the invention, the probability of each character being a start or end index and the probability of each whole entity span are calculated, with the advantage that outer-layer entities and nested entities contained in the text can be decoded simultaneously to obtain the final answer.
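The simultaneous decoding of outer and nested entities can be sketched as thresholding start, end and span probabilities independently; the probabilities and the 0.5 threshold are illustrative assumptions, not values fixed by the patent:

```python
def decode_spans(p_start, p_end, p_span, threshold=0.5):
    """Keep every (start, end) pair whose start, end and span probabilities
    all exceed the threshold; overlapping pairs are kept, so nested
    entities survive alongside the outer entity."""
    starts = [i for i, p in enumerate(p_start) if p > threshold]
    ends = [i for i, p in enumerate(p_end) if p > threshold]
    return [(s, e) for s in starts for e in ends
            if s <= e and p_span.get((s, e), 0.0) > threshold]

p_start = [0.9, 0.1, 0.8, 0.1]
p_end = [0.1, 0.9, 0.1, 0.9]
p_span = {(0, 1): 0.9, (0, 3): 0.7, (2, 3): 0.8}

spans = decode_spans(p_start, p_end, p_span)
# (0, 3) is the outer entity; (0, 1) and (2, 3) are nested inside it.
assert set(spans) == {(0, 1), (0, 3), (2, 3)}
```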
In specific implementation, the neural network is a BiLSTM.
The invention adopts a BiLSTM (Bidirectional Long Short-Term Memory) model for encoding, extracting richer bidirectional context feature information.
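The bidirectional composition pattern of the BiLSTM encoder can be sketched with a toy recurrent cell standing in for the LSTM gate stack (the cell itself is a hypothetical stand-in, not the patent's trained model):

```python
import math

def run_rnn(seq, step):
    """Run a simple recurrence h_t = step(x_t, h_{t-1}) over the sequence."""
    h, outputs = 0.0, []
    for x in seq:
        h = step(x, h)
        outputs.append(h)
    return outputs

def bidirectional_encode(seq, step):
    """BiLSTM composition pattern: run the recurrence forward and backward,
    then concatenate the two hidden states at every position."""
    fwd = run_rnn(seq, step)
    bwd = list(reversed(run_rnn(list(reversed(seq)), step)))
    return list(zip(fwd, bwd))

# Toy cell in place of the LSTM gates, for illustration only.
step = lambda x, h: math.tanh(0.5 * x + 0.5 * h)

encoded = bidirectional_encode([1.0, -1.0, 0.5], step)
assert len(encoded) == 3 and len(encoded[0]) == 2
```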
Finally, it is noted that the above-mentioned embodiments illustrate rather than limit the invention, and that, while the invention has been described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (6)
1. A bridge detection field text entity identification method based on machine reading understanding, characterized by comprising the following steps:
S1, acquiring a question text and a target text;
S2, extracting character embedding, bigram embedding and weighted word embedding from the question text and the target text;
S3, concatenating the character embedding, the bigram embedding and the weighted word embedding to obtain a joint feature representation;
S4, inputting the joint feature representation into a neural network to complete entity recognition.
2. The bridge detection field text entity recognition method based on machine reading understanding of claim 1, wherein the method for extracting character embedding in step S2 comprises:
serializing the question text as Q = [q_1, q_2, ..., q_m], where q_i denotes the i-th character of the question text, and serializing the target text as C = [c_1, c_2, ..., c_n], where c_i denotes the i-th character of the target text;
concatenating Q and C into X = [x_1, x_2, ..., x_l], where x_i ∈ Q ∪ C and l = m + n;
performing a lookup in the character embedding table to obtain a vector matrix E ∈ R^(l×d) that can be input into the BERT model, whose i-th element is e_i = w_c(x_i), the vector representation of character x_i in the character embedding table w_c, where d denotes the dimension of each character vector in the character embedding table.
3. The bridge detection field text entity recognition method based on machine reading understanding of claim 2, wherein the method for extracting the weighted word embedding in the step S2 comprises:
the four sets B, M, E and S are constructed as follows:
B(x_i) = {w_{i,k} | w_{i,k} ∈ D, i < k ≤ l}
M(x_i) = {w_{j,k} | w_{j,k} ∈ D, 1 ≤ j < i < k ≤ l}
E(x_i) = {w_{j,i} | w_{j,i} ∈ D, 1 ≤ j < i}
S(x_i) = {x_i | x_i ∈ D}
where D denotes the external dictionary and w_{i,k} denotes the subsequence [x_i, x_{i+1}, ..., x_k] of the input sequence X; B(x_i) collects the words w_{i,k} matched in D in which x_i is the start character; M(x_i) collects those in which x_i is a middle character; E(x_i) collects those in which x_i is the end character; and S(x_i) holds the current character when it is itself matched in D; if any of the four sets is empty, it is padded with the special word NONE;
the weighted word embedding is constructed as
e_i^w = [v^s(B); v^s(M); v^s(E); v^s(S)]
where e_i^w denotes the i-th element of the weighted word embedding, and v^s(B), v^s(M), v^s(E) and v^s(S) denote the weighted representations of B, M, E and S, respectively.
4. The bridge detection field text entity identification method based on machine reading understanding of claim 3, wherein the weighted representation v^s(L) of a word set L is calculated as
v^s(L) = (1/Z) · Σ_{w ∈ L} z(w) · ω_word(w),  Z = Σ_{w ∈ B∪M∪E∪S} z(w)
where z(w) denotes the frequency with which the word w appears in the external dictionary D, ω_word(w) denotes the word embedding representation of w looked up in the word embedding table, and Z denotes the total frequency over the matched word sets.
5. The bridge detection field text entity identification method based on machine reading understanding of any one of claims 1 to 4, wherein step S4 comprises:
inputting the joint feature representation into a neural network to extract feature information;
and predicting character probabilities and entity spans from the feature information to complete entity recognition.
6. The bridge detection field text entity identification method based on machine reading understanding of claim 5, wherein the neural network is a BiLSTM.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110357215.9A CN113033206B (en) | 2021-04-01 | 2021-04-01 | Bridge detection field text entity identification method based on machine reading understanding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110357215.9A CN113033206B (en) | 2021-04-01 | 2021-04-01 | Bridge detection field text entity identification method based on machine reading understanding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113033206A true CN113033206A (en) | 2021-06-25 |
CN113033206B CN113033206B (en) | 2022-04-22 |
Family
ID=76453874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110357215.9A Active CN113033206B (en) | 2021-04-01 | 2021-04-01 | Bridge detection field text entity identification method based on machine reading understanding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113033206B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113935324A (en) * | 2021-09-13 | 2022-01-14 | 昆明理工大学 | Cross-border national culture entity identification method and device based on word set feature weighting |
CN115879474A (en) * | 2023-02-14 | 2023-03-31 | 华东交通大学 | Fault nested named entity identification method based on machine reading understanding |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532303A (en) * | 2019-09-04 | 2019-12-03 | 重庆交通大学 | A kind of information retrieval and the potential relationship method of excavation for Bridge Management & Maintenance information |
CN111091000A (en) * | 2019-12-24 | 2020-05-01 | 深圳视界信息技术有限公司 | Processing system and method for extracting user fine-grained typical opinion data |
CN111160031A (en) * | 2019-12-13 | 2020-05-15 | 华南理工大学 | Social media named entity identification method based on affix perception |
CN111178074A (en) * | 2019-12-12 | 2020-05-19 | 天津大学 | Deep learning-based Chinese named entity recognition method |
US20200342056A1 (en) * | 2019-04-26 | 2020-10-29 | Tencent America LLC | Method and apparatus for natural language processing of medical text in chinese |
CN112101027A (en) * | 2020-07-24 | 2020-12-18 | 昆明理工大学 | Chinese named entity recognition method based on reading understanding |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200342056A1 (en) * | 2019-04-26 | 2020-10-29 | Tencent America LLC | Method and apparatus for natural language processing of medical text in chinese |
CN110532303A (en) * | 2019-09-04 | 2019-12-03 | 重庆交通大学 | A kind of information retrieval and the potential relationship method of excavation for Bridge Management & Maintenance information |
CN111178074A (en) * | 2019-12-12 | 2020-05-19 | 天津大学 | Deep learning-based Chinese named entity recognition method |
CN111160031A (en) * | 2019-12-13 | 2020-05-15 | 华南理工大学 | Social media named entity identification method based on affix perception |
CN111091000A (en) * | 2019-12-24 | 2020-05-01 | 深圳视界信息技术有限公司 | Processing system and method for extracting user fine-grained typical opinion data |
CN112101027A (en) * | 2020-07-24 | 2020-12-18 | 昆明理工大学 | Chinese named entity recognition method based on reading understanding |
Non-Patent Citations (2)
Title |
---|
CHEN GONG: "Hierarchical LSTM with char-subword-word tree-structure representation for Chinese named entity recognition", 《SCIENCE CHINA》 *
ZHANG Hainan et al.: "Chinese named entity recognition based on deep neural networks" (基于深度神经网络的中文命名实体识别), 《Journal of Chinese Information Processing》 (中文信息学报) *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113935324A (en) * | 2021-09-13 | 2022-01-14 | 昆明理工大学 | Cross-border national culture entity identification method and device based on word set feature weighting |
CN113935324B (en) * | 2021-09-13 | 2022-10-28 | 昆明理工大学 | Cross-border national culture entity identification method and device based on word set feature weighting |
CN115879474A (en) * | 2023-02-14 | 2023-03-31 | 华东交通大学 | Fault nested named entity identification method based on machine reading understanding |
Also Published As
Publication number | Publication date |
---|---|
CN113033206B (en) | 2022-04-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110298037B (en) | Convolutional neural network matching text recognition method based on enhanced attention mechanism | |
CN111209401A (en) | System and method for classifying and processing sentiment polarity of online public opinion text information | |
CN114064918B (en) | Multi-modal event knowledge graph construction method | |
CN111737496A (en) | Power equipment fault knowledge map construction method | |
CN113312501A (en) | Construction method and device of safety knowledge self-service query system based on knowledge graph | |
CN113033206B (en) | Bridge detection field text entity identification method based on machine reading understanding | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN111143553B (en) | Method and system for identifying specific information of real-time text data stream | |
CN113190656B (en) | Chinese named entity extraction method based on multi-annotation frame and fusion features | |
CN113051929A (en) | Entity relationship extraction method based on fine-grained semantic information enhancement | |
CN115599902B (en) | Oil-gas encyclopedia question-answering method and system based on knowledge graph | |
CN111309918A (en) | Multi-label text classification method based on label relevance | |
CN114491024B (en) | Specific field multi-label text classification method based on small sample | |
CN114153973A (en) | Mongolian multi-mode emotion analysis method based on T-M BERT pre-training model | |
CN115759119B (en) | Financial text emotion analysis method, system, medium and equipment | |
CN115827819A (en) | Intelligent question and answer processing method and device, electronic equipment and storage medium | |
CN111460097B (en) | TPN-based small sample text classification method | |
CN116756303A (en) | Automatic generation method and system for multi-topic text abstract | |
CN115017879A (en) | Text comparison method, computer device and computer storage medium | |
CN114398900A (en) | Long text semantic similarity calculation method based on RoBERTA model | |
CN113505222A (en) | Government affair text classification method and system based on text circulation neural network | |
CN116522165A (en) | Public opinion text matching system and method based on twin structure | |
Hua et al. | A character-level method for text classification | |
CN113792144B (en) | Text classification method of graph convolution neural network based on semi-supervision | |
CN115795060A (en) | Entity alignment method based on knowledge enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |