CN110889276A - Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features - Google Patents

Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features

Info

Publication number
CN110889276A
Authority
CN
China
Prior art keywords
vector
extracting
extraction
pointer
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911083955.7A
Other languages
Chinese (zh)
Other versions
CN110889276B (en)
Inventor
杨家兵
高怀恩
张学习
龙土志
董海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201911083955.7A priority Critical patent/CN110889276B/en
Publication of CN110889276A publication Critical patent/CN110889276A/en
Application granted granted Critical
Publication of CN110889276B publication Critical patent/CN110889276B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device and computer equipment for pointer-based extraction of triple information using complex-number fused features, comprising the following steps: S1: acquiring texts and the corresponding triple SPO labels; S2: training to obtain a character vector for each character; S3: inputting each character of the text, as its character vector, into a network for training and completing feature extraction; S4: inputting the extracted features into a pointer model for training; S5: extracting the SPO triples with the trained model. The invention provides a brand-new model for extracting triples from text: it adopts complex-number fused feature vectors, trains a pointer network model guided in sequence by the subject S, relation P and object O pointers, and then uses the trained model to extract all triples in the target text.

Description

Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features
Technical Field
The present invention relates to the field of text feature extraction and information extraction, and more particularly to a method, a system and a computer medium for pointer-based extraction of triple information using complex-number fused features.
Background
To meet the challenge of information explosion, automated tools are urgently needed to help people quickly find the information they really need in massive information sources. All of this massive information consists of sentences, and each sentence contains a number of "subject-predicate-object" triples (a subject S, an object O, and the relation P between S and O). Take a randomly chosen Baidu Baike (Baidu encyclopedia) entry: "XX Technology Co., Ltd. is a privately owned communication technology company that produces and sells communication equipment; it was formally registered in 1987, and its headquarters is located in the Longgang District of Shenzhen, Guangdong Province, China." In this sentence, the triples are { S: "XX Technology Co., Ltd.", O: "1987", P: "founding time" } and { S: "XX Technology Co., Ltd.", O: "Longgang District, Shenzhen, Guangdong", P: "headquarters location" }. How to extract such key information from online text efficiently and accurately is a major challenge in this field. Existing deep learning methods mostly fall into two types. One is joint extraction: a sentence is input into a joint entity-recognition and relation-extraction model, which turns relation extraction, originally a combination of a sequence labeling task and a classification task, entirely into a sequence labeling problem, and the triples are then obtained directly from an end-to-end neural network model. The other is a two-step method: a sentence is input, named entity recognition is performed first, the recognized subjects S and objects O are then paired, relation extraction is performed to obtain the relation P classification for each (S, O) pair, the triples with their entity relations are taken as input, and finally all triples are stored.
Disclosure of Invention
To address the problems in the prior art that existing deep learning methods cannot extract all triples, that sequence-labeling strategies do not support overlapping entity relations, and that two-step extraction methods cannot effectively handle cases where one subject S corresponds to multiple (P, O) pairs or one (S, O) pair corresponds to multiple relations P, the invention provides a method for pointer-based extraction of triple information using complex-number fused features.
A method for pointer-based extraction of triple information using complex-number fused features comprises the following steps:
S1: obtaining sentences and the corresponding triple labels from various texts, the triple labels being the subject S, the object O and the relation P;
S2: encoding each sentence into vector form, and obtaining a character vector for each character by training a word-position Embedding layer;
S3: inputting each character of the sentence, as its character vector, into a feature extraction network; after training, feature extraction is completed and the feature vector of each sentence is obtained;
S4: inputting the feature vector of each sentence into a pointer model for training;
S5: extracting all subjects S in the target text with the trained model; extracting all corresponding relations P guided by each subject S; and extracting all objects O guided by every (S, P) combination, the extracted objects corresponding one-to-one to the labels.
In a preferred embodiment, in step S1, sentences and corresponding triples are obtained through web crawlers and manual annotations, respectively.
In a preferred embodiment, the specific steps of S2 are as follows:
S21: encoding each character in all sentences, with a different number corresponding to each different character;
S22: determining a fixed sequence length X (here X = 100); when a sentence is longer than 100 characters it is truncated to 100, and when it is shorter than 100 it is padded with 0 at the end until its length is 100, forming a sentence vector the computer can process;
S23: feeding the sentence vectors obtained in step S22 into the word-position encoding layer (Embedding layer) for encoding, obtaining word-position-encoded character vectors.
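As an illustration of steps S21-S23, the following is a minimal sketch in Python, assuming a character-to-index vocabulary built from the training sentences, an embedding dimension of 128, and PyTorch's nn.Embedding as the encoding layer; none of these concrete choices are fixed by the patent, and the positional component of the word-position encoding is omitted for brevity.

```python
import torch
import torch.nn as nn

MAX_LEN = 100  # the fixed sequence length X of step S22

def build_vocab(sentences):
    """S21: assign a distinct non-zero number to every distinct character; 0 is reserved for padding."""
    vocab = {}
    for sent in sentences:
        for ch in sent:
            if ch not in vocab:
                vocab[ch] = len(vocab) + 1
    return vocab

def encode_sentence(sent, vocab):
    """S22: truncate to MAX_LEN, or pad with 0 at the end until the length is MAX_LEN."""
    ids = [vocab.get(ch, 0) for ch in sent[:MAX_LEN]]
    ids += [0] * (MAX_LEN - len(ids))
    return torch.tensor(ids)

sentences = ["XX科技有限公司正式注册成立于1987年。", "总部位于中国广东省深圳市龙岗区。"]
vocab = build_vocab(sentences)
batch = torch.stack([encode_sentence(s, vocab) for s in sentences])  # shape (2, 100)

# S23: the Embedding layer maps every character id to a trainable vector.
embedding = nn.Embedding(num_embeddings=len(vocab) + 1, embedding_dim=128, padding_idx=0)
char_vectors = embedding(batch)  # shape (2, 100, 128)
```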
In a preferred embodiment, the specific steps of S3 are as follows:
S31: feeding the obtained word-position-encoded character vectors into a feature extraction network for training, the feature extraction network comprising a convolutional network and a recurrent network;
S32: the convolutional network extracts a sentence feature vector A = [a_1, a_2, …, a_i]; the recurrent network extracts a sentence feature vector B = [b_1, b_2, …, b_i], where each a and b is a single-character vector;
S33: rewriting A and B into complex form Â and B̂ (the rewriting formula is given only as an embedded image in the original publication), where n is the number of elements in each of the vectors a_1 and b_1;
S34: performing complex addition of Â and B̂ and comparing moduli; if the modulus of the sum is larger than the moduli of both Â and B̂, the features are fused, i.e. h_i = a_i + b_i is selected; otherwise, the one of Â and B̂ with the larger modulus is taken as the final extracted feature, i.e. h_i is chosen from a_i and b_i;
S35: obtaining the final fused feature vector H = [h_1, h_2, h_3, …, h_i].
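A minimal sketch of the fusion rule in S33-S35 follows. The exact complex-number rewriting of S33 is given only as a formula image in the original publication, so this sketch assumes the simplest reading, in which the modulus of the complex form of a word vector reduces to the Euclidean norm of the underlying real vector; under that assumption the decision in S34 becomes a comparison of vector norms. The intuition is that the two branches are summed only when they reinforce each other; when they would partially cancel, the stronger branch is kept.

```python
import torch

def fuse_features(A, B):
    """A, B: (seq_len, d) feature vectors from the convolutional and recurrent networks."""
    fused = []
    for a_i, b_i in zip(A, B):
        norm_sum = torch.linalg.norm(a_i + b_i)      # |â_i + b̂_i| under the assumed reading
        norm_a = torch.linalg.norm(a_i)              # |â_i|
        norm_b = torch.linalg.norm(b_i)              # |b̂_i|
        if norm_sum > norm_a and norm_sum > norm_b:  # the two branches reinforce each other
            h_i = a_i + b_i                          # fuse: h_i = a_i + b_i
        else:                                        # they partially cancel
            h_i = a_i if norm_a >= norm_b else b_i   # keep the stronger of the two
        fused.append(h_i)
    return torch.stack(fused)                        # H = [h_1, ..., h_i]

A = torch.randn(100, 128)  # convolutional-branch features
B = torch.randn(100, 128)  # recurrent-branch features
H = fuse_features(A, B)    # fused feature vector H, shape (100, 128)
```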
In a preferred embodiment, the specific steps of S4 are as follows:
S41: taking the fused feature vector H = [h_1, h_2, h_3, …, h_i], where each h is the vector of one character;
S42: using the current state as the input of the current Attention_1 unit, computing the score A = softmax(V_a^T · tanh(W_a · h_i)), where V_a and W_a are trainable parameters, e denotes the score of the current feature vector and is obtained from the similarity between V_a and W_a·h_i, V_a^T is the transpose of the training parameter vector V_a (the usual transpose for computing similarity between vectors), and the weight A is obtained by normalization;
S43: through training, W_a has dimension d×d, h_i has dimension d×1 and V_a has dimension d×1; Attention_1 yields the final vector C = A × H, i.e. the weighted sum of the original vectors H is taken as the attention value. By representing the text with this attention-value vector, the network can learn from the label information where to attend for predicting the corresponding S, P and O. When predicting S, the subject position vector is in fact a binary classification target whose values are taken from {0, 1}, so the loss function is still the binary cross entropy;
S44: a binary (2-class) classifier is applied to the attention vector H_i = [h_1, h_2, h_3, …, h_i] to extract the corresponding SPO triple information; the model is trained with the Adam optimizer, first with a small learning rate, after which the best training result is loaded and training continues with an even smaller learning rate until the optimal state is reached.
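A sketch of the Attention_1 scoring of S42-S44, following the formula A = softmax(V_a^T tanh(W_a h_i)) given in claim 6. PyTorch, d = 128, and the way the per-position classifier consumes the weighted vectors C are assumptions; the patent leaves these details open.

```python
import torch
import torch.nn as nn

class Attention1(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.Wa = nn.Linear(d, d, bias=False)   # Wa: d x d
        self.Va = nn.Linear(d, 1, bias=False)   # Va: d x 1, applied as Va^T
        self.classifier = nn.Linear(d, 1)       # binary (2-class) pointer classifier

    def forward(self, H):
        # H: (seq_len, d) fused feature vectors
        e = self.Va(torch.tanh(self.Wa(H)))     # e_i = Va^T tanh(Wa h_i), shape (seq_len, 1)
        A = torch.softmax(e, dim=0)             # attention weights, normalized over the sentence
        C = A * H                               # weighted representation C = A x H (kept per position here)
        logits = self.classifier(C).squeeze(-1) # per-character score
        return torch.sigmoid(logits)            # probability of each position being a boundary

model = Attention1()
H = torch.randn(100, 128)
probs = model(H)                                # shape (100,)

# Training uses Adam and binary cross entropy as in S43-S44 (the learning rate is an assumption).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()
```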
In a preferred embodiment, the pointer model includes an Attention1 model, an Attention2 model, and an Attention3 model.
In a preferred embodiment, the specific steps of S5 are as follows:
S51: extracting all subjects S in the target text with the trained model, sampling one subject S1 from them, and processing the fused feature vectors and the correspondingly extracted feature vectors with the Attention_1 model;
S52: combining the fused feature vectors according to the position of S1, extracting P with the Attention_2 model, and obtaining the relation P vector through a softmax layer;
S53: for each different (S, P) combination, combining the fused feature vector with the predicted S and P vectors into a new vector, and predicting the position of the corresponding O through the Attention_3 model followed by a sigmoid layer; the Attention_3 model has the same structure as the Attention_1 and Attention_2 models and differs only in its training parameters, its attention being on the weights of the object O positions (the corresponding extraction information), and the sigmoid layer works on the same principle as in the prediction of S;
S54: outputting the corresponding SPO triple information.
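The decoding order of S51-S54 can be summarized as three nested loops. In the sketch below, predict_subjects, predict_relations and predict_objects are hypothetical stand-ins for the trained Attention_1, Attention_2 and Attention_3 models, and the returned strings are illustrative only.

```python
from typing import List, Tuple

def predict_subjects(sentence: str) -> List[str]:
    # Attention_1 plus two sigmoid layers mark the start/end of each subject S (stubbed here).
    return ["XX科技有限公司"]

def predict_relations(sentence: str, subject: str) -> List[str]:
    # Attention_2 plus a softmax layer yields the relations P guided by the subject position.
    return ["成立时间", "总部地点"]

def predict_objects(sentence: str, subject: str, relation: str) -> List[str]:
    # Attention_3 plus a sigmoid layer marks the object O positions guided by the (S, P) pair.
    return ["1987年"] if relation == "成立时间" else ["广东省深圳市龙岗区"]

def extract_triples(sentence: str) -> List[Tuple[str, str, str]]:
    triples = []
    for s in predict_subjects(sentence):               # S51: all subjects S
        for p in predict_relations(sentence, s):       # S52: relations P guided by S
            for o in predict_objects(sentence, s, p):  # S53: objects O guided by (S, P)
                triples.append((s, p, o))              # S54: output SPO triples
    return triples

print(extract_triples("XX科技有限公司成立于1987年，总部位于广东省深圳市龙岗区。"))
```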
The invention further discloses a system for pointer-based extraction of triple information using complex-number fused features: a web crawler acquires the raw data used to extract all triple information of the target object from the corpus; after training, a first extraction module extracts the subject S with the extraction model, a second extraction module extracts the relation P with the extraction model, and a third extraction module extracts the object O with the extraction model.
In a third aspect of the present invention, a computer-readable storage medium stores a program for the method for pointer-based extraction of triple information using complex-number fused features; when the program is executed by a processor, the steps of the method are implemented.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention provides a tree structure visualization method based on node introductivity change, and provides a brand-new model for extracting triples in a text, a pointer network model is trained according to a subject S and an object P pointer after a complex number is adopted to fuse feature vectors, and then all triples in a target are extracted by the trained model.
Drawings
Fig. 1 is a general flow chart of the method for pointer-based extraction of triple information using complex-number fused features provided by the present invention;
Fig. 2 is a schematic diagram of the character-vector processing in step S2 in example 2;
Fig. 3 is a schematic flowchart of step S3 in example 2;
Fig. 4 is a schematic flowchart of step S5 in example 2;
Fig. 5 is an expanded view of the Attention_1 model in step S5 in example 2;
Fig. 6 is a schematic block diagram of the system for pointer-based extraction of triple information using complex-number fused features provided in example 2.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and are used for illustration only, and should not be construed as limiting the patent. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
A method for pointer-based extraction of triple information using complex-number fused features, as shown in fig. 1, comprises the following steps:
S1: obtaining sentences and the corresponding triple labels from various texts, the triple labels being the subject S, the object O and the relation P;
S2: encoding each sentence into vector form, and obtaining a character vector for each character by training a word-position Embedding layer;
S3: inputting each character of the sentence, as its character vector, into a feature extraction network; after training, feature extraction is completed and the feature vector of each sentence is obtained;
S4: inputting the feature vector of each sentence into a pointer model for training;
S5: extracting all subjects S in the target text with the trained model; extracting all corresponding relations P guided by each subject S; and extracting all objects O guided by every (S, P) combination, the extracted objects corresponding one-to-one to the labels.
Example 2
A method for pointer-based extraction of triple information using complex-number fused features comprises the following steps:
S1: obtaining sentences and the corresponding triple labels from various texts, the triple labels being the subject S, the object O and the relation P;
S2: encoding each sentence into vector form, and obtaining a character vector for each character by training a word-position Embedding layer;
S3: inputting each character of the sentence, as its character vector, into a feature extraction network; after training, feature extraction is completed and the feature vector of each sentence is obtained;
S4: inputting the feature vector of each sentence into a pointer model for training;
S5: extracting all subjects S in the target text with the trained model; extracting all corresponding relations P guided by each subject S; and extracting all objects O guided by every (S, P) combination, the extracted objects corresponding one-to-one to the labels.
In a preferred embodiment, in step S1, sentences and corresponding triples are obtained through web crawlers and manual annotations, respectively.
In a preferred embodiment, as shown in fig. 2, the specific steps of S2 are as follows:
S21: encoding each character in all sentences, with a different number corresponding to each different character;
S22: determining a fixed sequence length X (here X = 100); when a sentence is longer than 100 characters it is truncated to 100, and when it is shorter than 100 it is padded with 0 at the end until its length is 100, forming a sentence vector the computer can process;
S23: feeding the sentence vectors obtained in step S22 into the word-position encoding layer (Embedding layer) for encoding, obtaining word-position-encoded character vectors.
In a preferred embodiment, as shown in fig. 3, the specific steps of S3 are as follows:
S31: feeding the obtained word-position-encoded character vectors into a feature extraction network for training, the feature extraction network comprising a convolutional network and a recurrent network;
S32: the convolutional network extracts a sentence feature vector A = [a_1, a_2, …, a_i]; the recurrent network extracts a sentence feature vector B = [b_1, b_2, …, b_i], where each a and b is a single-character vector (a code sketch of these two branches is given after step S35);
S33: rewriting A and B into complex form Â and B̂ (the rewriting formula is given only as an embedded image in the original publication), where n is the number of elements in each of the vectors a_1 and b_1;
S34: performing complex addition of Â and B̂ and comparing moduli; if the modulus of the sum is larger than the moduli of both Â and B̂, the features are fused, i.e. h_i = a_i + b_i is selected; otherwise, the one of Â and B̂ with the larger modulus is taken as the final extracted feature, i.e. h_i is chosen from a_i and b_i;
S35: obtaining the final fused feature vector H = [h_1, h_2, h_3, …, h_i].
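A minimal sketch of the two feature-extraction branches of S31-S32 referenced above: one convolutional network and one recurrent network over the embedded character vectors. The kernel size, hidden size and the choice of a bidirectional LSTM are assumptions; the patent only requires one convolutional branch and one recurrent branch.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.conv = nn.Conv1d(d, d, kernel_size=3, padding=1)                # convolutional branch -> A
        self.rnn = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)  # recurrent branch -> B

    def forward(self, x):
        # x: (batch, seq_len, d) word-position-encoded character vectors from S2
        A = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, seq_len, d)
        B, _ = self.rnn(x)                                # (batch, seq_len, d)
        return A, B                                       # fused afterwards as in S33-S35

extractor = FeatureExtractor()
x = torch.randn(2, 100, 128)
A, B = extractor(x)
```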
In a preferred embodiment, the specific steps of S4 are as follows:
S41: taking the fused feature vector H = [h_1, h_2, h_3, …, h_i], where each h is the vector of one character;
S42: using the current state as the input of the current Attention_1 unit, computing the score A = softmax(V_a^T · tanh(W_a · h_i)), where V_a and W_a are trainable parameters, e denotes the score of the current feature vector and is obtained from the similarity between V_a and W_a·h_i, V_a^T is the transpose of the training parameter vector V_a (the usual transpose for computing similarity between vectors), and the weight A is obtained by normalization;
S43: through training, W_a has dimension d×d, h_i has dimension d×1 and V_a has dimension d×1; Attention_1 yields the final vector C = A × H, i.e. the weighted sum of the original vectors H is taken as the attention value. By representing the text with this attention-value vector, the network can learn from the label information where to attend for predicting the corresponding S, P and O. When predicting S, the subject position vector is in fact a binary classification target whose values are taken from {0, 1}, so the loss function is still the binary cross entropy;
S44: a binary (2-class) classifier is applied to the attention vector H_i = [h_1, h_2, h_3, …, h_i] to extract the corresponding SPO triple information; the model is trained with the Adam optimizer, first with a small learning rate, after which the best training result is loaded and training continues with an even smaller learning rate until the optimal state is reached.
The Attention_1 model is expanded as shown in fig. 5: an attention network processes the fused feature vectors and the correspondingly extracted feature vectors to produce a final, brand-new sentence feature vector. Two sigmoid layers then predict, respectively, the start position of the first character and the end position of the last character of the subject S, for example [1,0,0,0,0,0] and [0,0,0,0,1,0,0,0,0,0].
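A small sketch of how the two sigmoid outputs of fig. 5 can be decoded back into a subject span. The pairing rule (each predicted start matched to the nearest predicted end at or after it) is an assumption, since the patent does not spell it out.

```python
import torch

def decode_spans(start_probs, end_probs, threshold=0.5):
    """Turn per-character start/end probabilities into (first, last) character index pairs."""
    starts = (start_probs > threshold).nonzero(as_tuple=True)[0]
    ends = (end_probs > threshold).nonzero(as_tuple=True)[0]
    spans = []
    for s in starts:
        later_ends = ends[ends >= s]
        if len(later_ends) > 0:
            spans.append((int(s), int(later_ends[0])))  # (first character, last character) of a subject S
    return spans

start_probs = torch.tensor([0.9, 0.1, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0])
end_probs   = torch.tensor([0.0, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0])
print(decode_spans(start_probs, end_probs))  # [(0, 4)] -> characters 0..4 form the subject S
```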
In a preferred embodiment, as shown in fig. 4, the specific steps of S5 are as follows:
S51: extracting all subjects S in the target text with the trained model, sampling one subject S1 from them, and processing the fused feature vectors and the correspondingly extracted feature vectors with the Attention_1 model;
S52: combining the fused feature vectors according to the position of S1, extracting P with the Attention_2 model, and obtaining the relation P vector through a softmax layer;
S53: for each different (S, P) combination, combining the fused feature vector with the predicted S and P vectors into a new vector, and predicting the position of the corresponding O through the Attention_3 model followed by a sigmoid layer; the Attention_3 model has the same structure as the Attention_1 and Attention_2 models and differs only in its training parameters, its attention being on the weights of the object O positions (the corresponding extraction information), and the sigmoid layer works on the same principle as in the prediction of S;
S54: outputting the corresponding SPO triple information.
The invention further discloses a system for pointer-based extraction of triple information using complex-number fused features, as shown in fig. 6: a web crawler acquires the raw data used to extract all triple information of the target object from the corpus; after training, a first extraction module extracts the subject S with the extraction model, a second extraction module extracts the relation P with the extraction model, and a third extraction module extracts the object O with the extraction model.
In a third aspect of the present invention, a computer-readable storage medium stores a program for the method for pointer-based extraction of triple information using complex-number fused features; when the program is executed by a processor, the steps of the method are implemented.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (9)

1. A method for pointer-based extraction of triple information using complex-number fused features, characterized by comprising the following steps:
S1: obtaining sentences and the corresponding triple labels from various texts, the triple labels being the subject S, the object O and the relation P;
S2: encoding each sentence into vector form, and obtaining a character vector for each character by training a word-position Embedding layer;
S3: inputting each character of the sentence, as its character vector, into a feature extraction network; after training, feature extraction is completed and the feature vector of each sentence is obtained;
S4: inputting the feature vector of each sentence into a pointer model for training;
S5: extracting all subjects S in the target text with the trained model; extracting all corresponding relations P guided by each subject S; and extracting all objects O guided by every (S, P) combination.
2. The method for pointer-based extraction of triple information using complex-number fused features according to claim 1, wherein in step S1 the sentences and the corresponding triples are obtained through web crawlers and manual annotation, respectively.
3. The method for pointer-based extraction of triple information using complex-number fused features according to claim 2, wherein the specific steps of S2 are as follows:
S21: encoding each character in all sentences, with different numbers corresponding to different characters;
S22: determining a fixed sequence length X; when a sentence is longer than X it is truncated to X, and when it is shorter than X it is padded with 0 at the end until its length is X, forming a sentence vector the computer can process;
S23: feeding the sentence vectors obtained in step S22 into the word-position encoding layer (Embedding layer) for encoding, obtaining word-position-encoded character vectors.
4. The method for pointer-based extraction of triple information using complex-number fused features according to claim 3, wherein the specific steps of S3 are as follows:
S31: feeding the obtained word-position-encoded character vectors into a feature extraction network for training, the feature extraction network comprising a convolutional network and a recurrent network;
S32: the convolutional network extracts a sentence feature vector A = [a_1, a_2, …, a_i]; the recurrent network extracts a sentence feature vector B = [b_1, b_2, …, b_i], where each a and b is a single-character vector;
S33: rewriting A and B into complex form Â and B̂ (the rewriting formula is given only as an embedded image in the original publication), where n is the number of elements in each of the vectors a_1 and b_1;
S34: performing complex addition of Â and B̂ and comparing moduli; if the modulus of the sum is larger than the moduli of both Â and B̂, the features are fused, i.e. h_i = a_i + b_i is selected; otherwise, the one of Â and B̂ with the larger modulus is taken as the final extracted feature, i.e. h_i is chosen from a_i and b_i;
S35: obtaining the final fused feature vector H = [h_1, h_2, h_3, …, h_i].
5. The method for pointer-based extraction of triple information using complex-number fused features according to claim 1, wherein the pointer model comprises an Attention_1 model, an Attention_2 model and an Attention_3 model.
6. The method for pointer-based extraction of triple information using complex-number fused features according to claim 5, wherein the specific steps of S4 are as follows:
S41: taking the fused feature vector H_i = [h_1, h_2, h_3, …, h_i], where each h is the vector of one character;
S42: using the current state as the input of the current Attention_1 unit, computing the score A = softmax(V_a^T · tanh(W_a · H_i)), where V_a and W_a are parameters, e denotes the score of the current feature vector and is obtained from the similarity between V_a and W_a·h_i, V_a^T is the transpose of the training parameter vector V_a (the usual transpose for computing similarity between vectors), and the weight A is obtained by normalization;
S43: through training, W_a has dimension d×d, h_i has dimension d×1 and V_a has dimension d×1; Attention_1 yields the final vector C = A × H, i.e. the weighted sum of the original vectors H is taken as the attention value; by representing the text with this attention-value vector, the network can learn from the label information where to attend for predicting the corresponding S, P and O;
S44: a binary (2-class) classifier is applied to the attention vector H_i = [h_1, h_2, h_3, …, h_i] to extract the corresponding SPO triple information; the model is trained with the Adam optimizer, first with a small learning rate, after which the best training result is loaded and training continues with an even smaller learning rate until the optimal state is reached.
7. The method for pointer-based extraction of triple information using complex-number fused features according to claim 4, wherein the specific steps of S5 are as follows:
S51: extracting all subjects S in the target text with the trained model, sampling one subject S1 from them, and processing the fused feature vectors and the correspondingly extracted feature vectors with the Attention_1 model;
S52: combining the fused feature vectors according to the position of the subject S1, extracting P with the Attention_2 model and obtaining the relation P vector through a softmax layer;
S53: for each different (S, P) combination, combining the fused feature vector with the predicted S and P vectors into a new vector, and predicting the position of the corresponding O through the Attention_3 model followed by a sigmoid layer in turn;
S54: outputting the corresponding SPO triple information.
8. A system for pointer-based extraction of triple information using complex-number fused features, characterized in that raw data are acquired with a web crawler and used for extracting all the triple information of the target object from the corpus; after training, a first extraction module extracts the subject S with the extraction model, a second extraction module extracts the relation P with the extraction model, and a third extraction module extracts the object O with the extraction model.
9. A computer-readable storage medium, characterized by comprising a program for the method for pointer-based extraction of triple information using complex-number fused features, wherein when the program is executed by a processor, the steps of the method for pointer-based extraction of triple information using complex-number fused features according to any one of claims 1 to 7 are implemented.
CN201911083955.7A 2019-11-07 2019-11-07 Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features Active CN110889276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911083955.7A CN110889276B (en) 2019-11-07 2019-11-07 Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911083955.7A CN110889276B (en) 2019-11-07 2019-11-07 Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features

Publications (2)

Publication Number Publication Date
CN110889276A true CN110889276A (en) 2020-03-17
CN110889276B CN110889276B (en) 2023-04-25

Family

ID=69747071

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911083955.7A Active CN110889276B (en) Method, system and computer medium for pointer-based extraction of triple information using complex-number fused features

Country Status (1)

Country Link
CN (1) CN110889276B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2668306A1 (en) * 2009-06-08 2010-12-08 Stephen R. Germann Method and system for applying metadata to data sets of file objects
CN109902145A (en) * 2019-01-18 2019-06-18 中国科学院信息工程研究所 A kind of entity relationship joint abstracting method and system based on attention mechanism

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859922A (en) * 2020-07-31 2020-10-30 上海银行股份有限公司 Application method of entity relation extraction technology in bank wind control
CN111859922B (en) * 2020-07-31 2023-12-01 上海银行股份有限公司 Application method of entity relation extraction technology in bank wind control
CN113051922A (en) * 2021-04-20 2021-06-29 北京工商大学 Triple extraction method and system based on deep learning

Also Published As

Publication number Publication date
CN110889276B (en) 2023-04-25

Similar Documents

Publication Publication Date Title
CN110619123B (en) Machine reading understanding method
CN107729309B (en) Deep learning-based Chinese semantic analysis method and device
CN111694924A (en) Event extraction method and system
CN110851596A (en) Text classification method and device and computer readable storage medium
CN111737511B (en) Image description method based on self-adaptive local concept embedding
CN110796160A (en) Text classification method, device and storage medium
CN113836866B (en) Text encoding method, text encoding device, computer readable medium and electronic equipment
CN112800225B (en) Microblog comment emotion classification method and system
CN109933792A (en) Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method
CN111914553B (en) Financial information negative main body judging method based on machine learning
CN112559734A (en) Presentation generation method and device, electronic equipment and computer readable storage medium
CN114841151B (en) Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN110889276A (en) Method, system and computer medium for extracting pointer-type extraction triple information by complex fusion features
CN114239574A (en) Miner violation knowledge extraction method based on entity and relationship joint learning
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN114065702A (en) Event detection method fusing entity relationship and event element
CN116932661A (en) Event knowledge graph construction method oriented to network security
CN116258137A (en) Text error correction method, device, equipment and storage medium
CN117271759A (en) Text abstract generation model training method, text abstract generation method and device
CN116663566A (en) Aspect-level emotion analysis method and system based on commodity evaluation
CN109858031A (en) Neural network model training, context-prediction method and device
CN113204975A (en) Sensitive character wind identification method based on remote supervision
CN112016493A (en) Image description method and device, electronic equipment and storage medium
US11869130B2 (en) Generating visual feedback
CN112905750A (en) Generation method and device of optimization model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant