CN111008529A - Chinese relation extraction method based on neural network - Google Patents

Chinese relation extraction method based on neural network Download PDF

Info

Publication number
CN111008529A
CN111008529A CN201910669521.9A CN201910669521A
Authority
CN
China
Prior art keywords
model
neural network
information
Chinese
relation extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910669521.9A
Other languages
Chinese (zh)
Other versions
CN111008529B (en)
Inventor
王凯
秦永彬
李婷
陈艳平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Luhao Technology Co.,Ltd.
Original Assignee
Guizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University filed Critical Guizhou University
Priority to CN201910669521.9A priority Critical patent/CN111008529B/en
Publication of CN111008529A publication Critical patent/CN111008529A/en
Application granted granted Critical
Publication of CN111008529B publication Critical patent/CN111008529B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a Chinese relation extraction method based on a neural network that can effectively acquire the structural and semantic information of sentences. In the relation extraction task, a single long short-term memory model can only learn features of one dimensionality, while a convolutional neural network can learn features of different dimensionalities using multiple convolution kernels. Drawing on both characteristics, the invention provides a multi-layer bidirectional long short-term memory attention model. By setting hidden layers of different sizes for the long short-term memory model, the method automatically extracts abstract features of different dimensionalities together with dependency information from the raw input, and captures global information with an attention mechanism. Experiments show that, compared with a multi-kernel convolutional neural network and a single long short-term memory attention model, the disclosed method significantly improves Chinese relation extraction, achieving an F value of 71.61% on the ACE RDC 2005 Chinese dataset and thereby demonstrating its effectiveness.

Description

Chinese relation extraction method based on neural network
Technical Field
The invention relates to the field of information extraction, and in particular to a Chinese relation extraction method based on a neural network. It belongs to the technical fields of natural language processing and machine learning.
Background
With the development of artificial intelligence and the rapid growth of the information extraction field, entity relation extraction, an important research topic within information extraction, has attracted increasing attention from scholars. Its main purpose is to extract the semantic relation between labeled entity pairs in a sentence, that is, to determine the relation category between entity pairs in unstructured text on the basis of entity recognition, and to form structured data for storage and retrieval. The results of entity relation extraction can be used to construct knowledge graphs or ontology knowledge bases, and can provide data support for building automatic question-answering systems. Beyond that, entity relation extraction has important research significance for semantic web annotation, discourse understanding, and machine translation.
Early relation extraction was mainly based on grammatical rules, analyzing the grammatical structure of sentences as the basis for generating relations. Although such methods achieved good results, their strict rules made recall difficult to improve, they required professional grammatical knowledge and a strong literature foundation, and their applicability was limited. As the technology developed, relation extraction methods came to be divided into supervised, semi-supervised, and unsupervised methods. Given the content of the invention, supervised relation extraction is the main focus here. Supervised relation extraction can mostly be regarded as a classification problem, approached with two main families of methods: shallow structure models and deep learning models.
A shallow structure generally has only one hidden layer, or none, as in support vector machines and maximum entropy models. Shallow structures in relation extraction often rely on feature engineering or kernel functions. Traditional feature-engineering methods depend mainly on a skillfully designed feature set produced by a language processing pipeline. Most of these methods rely either on a large number of manually designed features or on carefully designed kernel functions. Despite the help of many excellent NLP tools, there remains a risk of performance degradation due to errors such as inaccurate word segmentation and syntactic parsing mistakes. More importantly, the low portability of these carefully designed features or kernel functions greatly limits their scalability.
In recent years, a great deal of research has addressed relation extraction based on deep learning. These methods, built on models such as CNNs and RNNs, have achieved excellent results. Many neural network-based methods demonstrate advantages over traditional shallow structures, but most of these results were obtained on well-balanced English datasets and use many external features as aids. Chinese grammatical structure is more complex, and linguistic ambiguity is more severe.
Disclosure of Invention
The invention provides a Chinese relation extraction method based on a neural network. By setting hidden layers of different sizes for the long short-term memory model, the method automatically extracts abstract features of different dimensionalities together with dependency information from the raw input, and captures global information with an attention mechanism. Experiments show that, compared with a multi-kernel convolutional neural network and a single long short-term memory attention model, the method significantly improves Chinese relation extraction and obtains a better result on the ACE RDC 2005 Chinese dataset, demonstrating its effectiveness. The model framework is shown in FIG. 1.
The technical scheme of the invention is as follows. A Chinese relation extraction method based on a neural network comprises the following steps: step 1, constructing a BiLSTMA unit and extracting the deep semantic information and global dependency information of a sentence; step 2, constructing a Multi-BiLSTMA model and acquiring semantic information with dependency relations of different granularities; and step 3, verifying the validity of the method on real data.
Step 1 makes full use of the advantages of the bidirectional long short-term memory model (BiLSTM) in handling long-range dependency problems and of the attention mechanism's (Attention's) ability to capture global dependency information, constructing a BiLSTMA unit (BiLSTM-Attention) to extract the deep semantic information and dependency information of sentences.
Step 2 sets hidden layers of different sizes in the BiLSTMA units and combines BiLSTMA units of different sizes to construct a Multi-BiLSTMA model, which can obtain semantic information with dependency relations of different granularities.
Step 3 verifies the effectiveness of the method by evaluating its recognition performance on the ACE RDC 2005 Chinese dataset.
Advantageous effects
The invention has the beneficial effects that:
The key point of the invention is that, drawing on the ability of a multi-kernel CNN to learn features of different granularities, a Multi-BiLSTMA model is constructed from BiLSTMs of different sizes combined with an Attention mechanism; experiments prove that the method performs excellently on the ACE RDC 2005 Chinese dataset.
The invention provides a Chinese relation extraction method based on a Multi-BiLSTM-Attention neural network model. Experiments prove that the method achieves higher performance on the ACE dataset, demonstrating its effectiveness. The method effectively exploits the multi-kernel CNN's ability to learn features of different granularities and combines it with BiLSTM, giving full play to the neural network model's capacity for automatic feature extraction. Setting several hidden layers of different sizes in the bidirectional BiLSTM channels prevents feature sparsity to a certain extent, effectively acquires and utilizes the semantic information of characters, and automatically obtains abstract features of different dimensionalities. On this basis, an Attention mechanism is added: the local and global features of sentences are used to adjust the weights, reducing noise and improving accuracy.
The method combines the observation that a single long short-term memory model can only learn one specific dimensionality with the ability of multiple convolution kernels in a convolutional neural network to learn different dimensionalities, proposing a Multi-BiLSTM-Attention model that achieves excellent performance on Chinese relation extraction.
Drawings
FIG. 1 is the Multi-BiLSTM-Attention neural network model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
For a sentence containing two entities, the task of relation extraction is to identify the candidate relation between them. The bidirectional long short-term memory (BiLSTM) model is a variant of the recurrent neural network (RNN); it can effectively process long-range information and alleviate gradient problems, and because BiLSTM and Attention complement each other well, the two are used in combination. However, a single, fixed BiLSTM can only learn information of one particular dimensionality, so a Multi-BiLSTM model is constructed by setting BiLSTMs of different sizes. This model can learn features with dependency information in multiple dimensions.
First, the input layer of the model consists of word vectors obtained by mapping into a randomly initialized lookup table. If the sentence length is L, the sentence mapped into vectors can be represented as X = [x_1, x_2, ···, x_L], where x_i ∈ R^D is the vector of the i-th word w_i and D is the vector dimension. If the dictionary size is V, the Embedding layer's lookup table can be expressed as a matrix in R^{V×D}. This process can be expressed as X = Embedding(S).
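The embedding lookup described above can be sketched in a few lines of numpy; the function names and the uniform initialization range are illustrative assumptions, since the description only states that the table is randomly initialized and adjusted during training.

```python
import numpy as np

def build_lookup_table(vocab_size, dim, seed=0):
    # Randomly initialized lookup table in R^{V x D}; during training it
    # would be adjusted continuously, as the description specifies.
    rng = np.random.default_rng(seed)
    return rng.uniform(-0.25, 0.25, size=(vocab_size, dim))

def embed(token_ids, table):
    # X = Embedding(S): map a sentence of token ids to X = [x_1, ..., x_L].
    return table[np.asarray(token_ids)]

E = build_lookup_table(vocab_size=1000, dim=100)   # V = 1000, D = 100
X = embed([5, 17, 999, 0], E)                      # a toy sentence, L = 4
print(X.shape)  # (4, 100)
```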
Next, the Multi-BiLSTMA layer of the invention is composed of three BiLSTMA units, each consisting of one BiLSTM layer and one Attention layer. As shown in FIG. 1(b), the BiLSTMA unit receives the output of the Embedding layer and combines a forward LSTM and a backward LSTM into a BiLSTM layer, which extracts deeper features from the embedded input. This process is summarized as follows:
h_t^→ = LSTM^→(x_t)
h_t^← = LSTM^←(x_t)
h_t = h_t^→ ⊕ h_t^←
where ⊕ denotes element-wise addition. The Attention layer then weights the BiLSTM output at each time step and combines it, computing the information that most strongly influences the extraction result. This process can be summarized as A = Attention(H).
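The attention step can be sketched with numpy as a softmax-weighted sum over time steps; the scoring function (a dot product with a learned vector w) is an illustrative assumption, as the description does not give the exact parametrization.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention(H, w):
    # Score each time step's hidden state h_t against a learned vector w,
    # normalize the scores with softmax, and take the weighted sum over time,
    # so states that influence the result most receive the largest weight.
    scores = H @ w            # shape (L,)
    alpha = softmax(scores)   # attention weights, sum to 1
    return alpha @ H          # A = sum_t alpha_t * h_t, shape (hidden,)

rng = np.random.default_rng(1)
H = rng.normal(size=(6, 8))   # BiLSTM output: L = 6 time steps, hidden size 8
w = rng.normal(size=8)        # illustrative attention parameter
A = attention(H, w)
print(A.shape)  # (8,)
```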
Next comes the model's fully connected layer. The outputs of the three BiLSTMA units are spliced together, and the information learned by the model is classified through a fully connected (Dense) layer whose hidden size equals the number of relation types, namely 7. This process is summarized as D = Dense(A).
Finally, to obtain a better experimental result, a softmax layer normalizes the output of the fully connected layer to produce the final classification. This process can be summarized as Y = softmax(D).
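The classification head just described (splice the three unit outputs, Dense layer of size 7, softmax) can be sketched as follows; the weight initialization is an illustrative assumption, and the unit output sizes 100, 200, and 300 are the hidden sizes reported later in the experiments.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def classify(unit_outputs, W, b):
    # Splice the three BiLSTMA unit outputs together, pass them through a
    # Dense layer whose size is the number of relation types (7, i.e. six
    # positive types plus "Other"), and softmax-normalize: Y = softmax(Dense(A)).
    a = np.concatenate(unit_outputs)   # 100 + 200 + 300 = 600 dimensions
    return softmax(W @ a + b)

rng = np.random.default_rng(0)
units = [rng.normal(size=s) for s in (100, 200, 300)]
W = rng.normal(size=(7, 600)) * 0.01   # illustrative Dense weights
b = np.zeros(7)
y = classify(units, W, b)
print(y.shape, round(float(y.sum()), 6))  # (7,) 1.0
```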
The effectiveness of the method is verified on real data: the ACE RDC 2005 standard Chinese dataset, which is first preprocessed.
The invention performs relation extraction on the publicly released ACE RDC 2005 Chinese dataset. After filtering out malformed documents, the experiments use 628 documents in total. The dataset contains 6 entity relation types (collectively, the positive examples): "PART-WHOLE", "PHYS", "ORG-AFF", "GEN-AFF", "PER-SOC", and "ART". The relations in the dataset are directional; for example, if the entity pair (A, B) has an "ART" relation in the dataset but no relation type is labeled between the pair (B, A), all such cases are collectively called negative examples and their relation type is labeled "Other". Since relation extraction is mainly performed at the sentence level, the text in the dataset is cut into sentences at the Chinese sentence-level punctuation marks "。", "！", "？", and "：". Sentences without entity pairs are discarded, and sentences duplicated between the positive and negative examples are removed (the same sentence cannot be both a positive and a negative example), yielding 101056 sentences in total: 9244 positive-example sentences and 91812 negative-example sentences. The ACE RDC 2005 Chinese dataset is unbalanced; the relation types are unevenly distributed, with negative examples accounting for as much as 90.85%. To better match the real situation and reduce the influence of the large amount of negative-example data, only the positive-example results are evaluated.
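The sentence-cutting step above can be sketched with a regular expression split; the delimiter set below uses only the marks listed in the text and is illustrative, since the original names more punctuation marks than it lists.

```python
import re

# Sentence-level marks as listed in the text; illustrative, not exhaustive.
SENT_DELIMS = "。！？："

def split_sentences(text):
    # Cut text into sentences at the Chinese punctuation marks, dropping
    # empty fragments (sentences without entity pairs are filtered later).
    return [p for p in re.split("[" + SENT_DELIMS + "]", text) if p]

doc = "张三在北京工作。他是李四的同事！你认识他吗？"
print(split_sentences(doc))  # ['张三在北京工作', '他是李四的同事', '你认识他吗']
```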
Second, for word vector processing, the lookup table is randomly initialized and continuously adjusted during training, with the word vector dimension set to 100. Because the neural network requires fixed-size input, the average sentence length for each relation type was analyzed. To balance extraction effectiveness against training cost, a maximum input length of 50 is chosen: sentences shorter than 50 are padded with '0' up to length 50, and sentences longer than 50 are truncated to 50. AdaDelta is selected as the optimization function, with its default learning rate of 1.0. Further, the batch size is set to 50 and the number of iterations to 100. In the experiments, three BiLSTMA units are used, with hidden layer sizes of 100, 200, and 300, respectively.
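The fixed-length preprocessing described above (pad with '0' up to 50, truncate above 50) can be sketched as:

```python
def pad_or_truncate(token_ids, max_len=50, pad_id=0):
    # Fix every sentence to the maximum input length of 50: pad short
    # sentences with '0', truncate long ones, as the fixed-size network
    # input requires.
    if len(token_ids) >= max_len:
        return token_ids[:max_len]
    return token_ids + [pad_id] * (max_len - len(token_ids))

short = pad_or_truncate([3, 1, 4])
long_ = pad_or_truncate(list(range(60)))
print(len(short), len(long_))  # 50 50
```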
Finally, three tasks were designed on the same data to demonstrate the effectiveness of the method. The first task uses a multi-kernel CNN for relation extraction and serves as the baseline model. The second task uses a single-layer BiLSTMA; experiments show that combining BiLSTM with Attention outperforms the plain multi-kernel CNN. The third task uses the Multi-BiLSTMA model, showing that it achieves an effect analogous to the multi-kernel CNN while fully exploiting the advantages of BiLSTM and Attention, and that it clearly improves on the results of the first two.
After 5-fold cross-validation experiments, the performance is shown in Table 1 (the F values of the three models are shown in bold).
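The 5-fold split can be sketched as a simple index partition; the contiguous-fold scheme below is an illustrative assumption (the description does not state how the folds were drawn), and the total of 101056 is the count implied by the reported 9244 positive and 91812 negative sentences.

```python
def k_fold_indices(n, k=5):
    # Partition n example indices into k near-equal contiguous folds; each
    # fold serves once as the held-out split in k-fold cross-validation.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(101056, k=5)
print([len(f) for f in folds])  # [20212, 20211, 20211, 20211, 20211]
```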
TABLE 1 relationship extraction task Performance
[Table 1 is provided as an image in the original publication.]
The number of examples per relation type is unbalanced, and this is reflected directly in Table 1: overall, the types with more examples also score higher, consistent with the characteristics of neural networks. In general, for the same data quality and the same model, the larger the data volume, the more sufficient the training, the lower the risk of overfitting, and the better the result. The results also show that the F values of the three classes "PART-WHOLE", "ORG-AFF", and "GEN-AFF" are significantly higher than those of the other three positive-example types, which is likewise determined by the large data volumes of these three classes.
Meanwhile, as Table 1 shows, the single-layer BiLSTMA outperforms the plain multi-kernel CNN, because BiLSTMA captures the dependency information and key features in sentences more effectively than a CNN, yielding a better extraction effect. Multi-BiLSTMA combines the characteristics of both, so its performance is clearly superior to either. In conclusion, the Chinese relation extraction method based on a neural network provided by the invention performs excellently.
Matters not described in detail in the present invention are well known to those skilled in the art. Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications should be covered by the claims of the present invention.

Claims (4)

1. The Chinese relation extraction method based on the neural network is characterized by comprising the following steps of:
step 1: constructing a BiLSTMA unit, and extracting the deep semantic information and global dependency information of a sentence;
step 2: constructing a Multi-BiLSTMA model, and acquiring semantic information with dependency relations of different granularities;
and step 3: verifying the validity of the method using real data.
2. The method of claim 1, wherein step 1 constructs the BiLSTMA unit to extract the deep semantic information and dependency information of a sentence by making full use of the advantages of the bidirectional long short-term memory model in handling long-range dependency problems and of the attention mechanism's ability to capture global dependency information.
3. The method of claim 1, wherein step 2 constructs a Multi-BiLSTMA model by setting hidden layers of different sizes in the BiLSTMA units and combining BiLSTMA units of different sizes, the model obtaining semantic information with dependency relations of different granularities.
4. The method of claim 1, wherein step 3 verifies the recognition effect, and thus the validity, of the method on the ACE RDC 2005 Chinese dataset.
CN201910669521.9A 2019-07-24 2019-07-24 Chinese relation extraction method based on neural network Active CN111008529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910669521.9A CN111008529B (en) 2019-07-24 2019-07-24 Chinese relation extraction method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910669521.9A CN111008529B (en) 2019-07-24 2019-07-24 Chinese relation extraction method based on neural network

Publications (2)

Publication Number Publication Date
CN111008529A true CN111008529A (en) 2020-04-14
CN111008529B CN111008529B (en) 2023-07-21

Family

ID=70111470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910669521.9A Active CN111008529B (en) 2019-07-24 2019-07-24 Chinese relation extraction method based on neural network

Country Status (1)

Country Link
CN (1) CN111008529B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831783A (en) * 2020-07-07 2020-10-27 北京北大软件工程股份有限公司 Chapter-level relation extraction method
CN114647726A (en) * 2022-03-04 2022-06-21 贵州大学 News webpage information extraction method, system, equipment and medium based on multi-dimensional text features

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9799327B1 (en) * 2016-02-26 2017-10-24 Google Inc. Speech recognition with attention-based recurrent neural networks
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
US20190080225A1 (en) * 2017-09-11 2019-03-14 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Merge more granularity sentences interaction natural language inference model of Attention mechanism

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9799327B1 (en) * 2016-02-26 2017-10-24 Google Inc. Speech recognition with attention-based recurrent neural networks
US20190080225A1 (en) * 2017-09-11 2019-03-14 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Merge more granularity sentences interaction natural language inference model of Attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHENG, Shuyu et al.: "Research on natural language inference with Attention-fused multi-granularity sentence interaction", Journal of Chinese Computer Systems *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831783A (en) * 2020-07-07 2020-10-27 北京北大软件工程股份有限公司 Chapter-level relation extraction method
CN111831783B (en) * 2020-07-07 2023-12-08 北京北大软件工程股份有限公司 Method for extracting chapter-level relation
CN114647726A (en) * 2022-03-04 2022-06-21 贵州大学 News webpage information extraction method, system, equipment and medium based on multi-dimensional text features
CN114647726B (en) * 2022-03-04 2024-08-06 贵州大学 News webpage information extraction method, system, equipment and medium based on multidimensional text features

Also Published As

Publication number Publication date
CN111008529B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN109635109B (en) Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism
CN107133213B (en) Method and system for automatically extracting text abstract based on algorithm
WO2023065544A1 (en) Intention classification method and apparatus, electronic device, and computer-readable storage medium
CN112487203B (en) Relation extraction system integrated with dynamic word vector
CN107291693B (en) Semantic calculation method for improved word vector model
CN109783817B (en) Text semantic similarity calculation model based on deep reinforcement learning
US10394956B2 (en) Methods, devices, and systems for constructing intelligent knowledge base
CN106372061B (en) Short text similarity calculation method based on semantics
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN106776562A (en) A kind of keyword extracting method and extraction system
CN110619051B (en) Question sentence classification method, device, electronic equipment and storage medium
CN111985228B (en) Text keyword extraction method, text keyword extraction device, computer equipment and storage medium
CN110347790B (en) Text duplicate checking method, device and equipment based on attention mechanism and storage medium
CN110765755A (en) Semantic similarity feature extraction method based on double selection gates
CN111680488A (en) Cross-language entity alignment method based on knowledge graph multi-view information
CN111475622A (en) Text classification method, device, terminal and storage medium
CN112232053A (en) Text similarity calculation system, method and storage medium based on multi-keyword pair matching
CN109918507B (en) textCNN (text-based network communication network) improved text classification method
CN114880461A (en) Chinese news text summarization method combining contrast learning and pre-training technology
CN111859979A (en) Ironic text collaborative recognition method, ironic text collaborative recognition device, ironic text collaborative recognition equipment and computer readable medium
CN110276396A (en) Picture based on object conspicuousness and cross-module state fusion feature describes generation method
CN113468854A (en) Multi-document automatic abstract generation method
CN111061873B (en) Multi-channel text classification method based on Attention mechanism
CN115422939A (en) Fine-grained commodity named entity identification method based on big data
CN111008529A (en) Chinese relation extraction method based on neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20241015

Address after: No. 001, Building 3, Computing Power Center, Guiyang Big Data Science and Technology Innovation City, Huchao Township, Gui'an New District, Guiyang City, Guizhou Province, 550000

Patentee after: Guizhou Luhao Technology Co.,Ltd.

Country or region after: China

Address before: Science and Technology Department of Huaxi north campus, Guizhou University, Huaxi District, Guiyang City, Guizhou Province

Patentee before: Guizhou University

Country or region before: China