WO2019085328A1 - Enterprise relationship extraction method, device and storage medium (企业关系提取方法、装置及存储介质) - Google Patents

Enterprise relationship extraction method, device and storage medium (企业关系提取方法、装置及存储介质)

Info

Publication number
WO2019085328A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
word
sentence
training
hidden layer
Prior art date
Application number
PCT/CN2018/076119
Other languages
English (en)
French (fr)
Inventor
徐冰
汪伟
罗傲雪
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2019085328A1 publication Critical patent/WO2019085328A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Definitions

  • The present application relates to the field of data information processing technologies, and in particular to an enterprise relationship extraction method, device, and computer-readable storage medium.
  • The present application provides an enterprise relationship extraction method, device, and computer-readable storage medium, which can extend a relationship extraction model based on a convolutional neural network to distantly supervised data, effectively reducing the model's dependence on manually annotated data.
  • Moreover, this supervised approach to enterprise relationship extraction achieves better precision and recall than semi-supervised or unsupervised methods.
  • the present application provides a method for extracting enterprise relationships, including:
  • Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
  • Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
  • Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
  • Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  • In addition, the present application also provides an electronic device, including: a memory, a processor, and an enterprise relationship extraction program stored on the memory and operable on the processor, where the enterprise relationship extraction program, when executed by the processor, implements the following steps:
  • Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
  • Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
  • Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
  • Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  • In addition, the present application further provides a computer-readable storage medium that includes an enterprise relationship extraction program which, when executed by a processor, can implement any step of the enterprise relationship extraction method described above.
  • The enterprise relationship extraction method, electronic device, and computer-readable storage medium proposed by this application extract, from unstructured text, sentences of enterprise entity pairs that have a relationship in the knowledge base, use them as training sentences, and build a sample library. Then all training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated by the long short-term memory module. Next, the average vector S of the training sentences is calculated from the feature vectors T i, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair.
  • Finally, a sentence containing two enterprise entities is extracted from the current text, and the feature vector T of the sentence is obtained through the bidirectional long short-term memory module.
  • The feature vector T is input into the trained recurrent neural network model to predict the relationship between the two enterprise entities. This improves the ability to identify relationships between different enterprises in the news and reduces dependence on manual annotation of training data.
  • FIG. 1 is a schematic diagram of a preferred embodiment of an electronic device of the present application.
  • FIG. 2 is a schematic block diagram of a preferred embodiment of the enterprise relationship extraction procedure of FIG. 1;
  • FIG. 3 is a flow chart of a preferred embodiment of an enterprise relationship extraction method according to the present application.
  • FIG. 4 is a frame diagram of a prediction module of the present application.
  • As shown in FIG. 1, it is a schematic diagram of a preferred embodiment of the electronic device 1 of the present application.
  • the electronic device 1 may be a server, a smart phone, a tablet computer, a personal computer, a portable computer, and other electronic devices having computing functions.
  • the electronic device 1 includes a memory 11, a processor 12, a knowledge base 13, a network interface 14, and a communication bus 15.
  • the knowledge base 13 is stored in the memory 11, and the sentence containing the pair of enterprise entities is extracted from the knowledge base 13 as a training sample to build a sample library.
  • the network interface 14 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • Communication bus 15 is used to implement connection communication between these components.
  • the memory 11 includes at least one type of readable storage medium.
  • the at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card type memory, or the like.
  • the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
  • In other embodiments, the memory 11 may also be an external storage unit of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card (Flash Card), etc.
  • the memory 11 can be used not only for storing application software installed on the electronic device 1 and various types of data, such as the enterprise relationship extraction program 10, the knowledge base 13 and the sample library, but also for temporarily Stores data that has been output or will be output.
  • The processor 12, in some embodiments, may be a Central Processing Unit (CPU), microprocessor, or other data processing chip for running program code stored in the memory 11 or processing data, for example executing the enterprise relationship extraction program 10 and training the various models.
  • the electronic device 1 may further include a display, which may be referred to as a display screen or a display unit.
  • In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like.
  • The display is used to display information processed in the electronic device 1 and a visualized work interface, for example displaying the results of model training and the optimal values of the weights a i.
  • Preferably, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard and an audio output device such as a speaker or headphones; optionally, the user interface may also include a standard wired interface and a wireless interface.
  • In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, stores the program code of the enterprise relationship extraction program 10; when the processor 12 executes this program code, the following steps are implemented:
  • Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
  • Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
  • Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
  • Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  • In this embodiment, it is assumed that if two enterprise entities have a certain relationship in the knowledge base, then any unstructured sentence containing the two enterprise entities can express that relationship. Therefore, when we need to identify the association between two enterprise entities in the news, we extract all unstructured sentences containing the two entities from the knowledge base and use those sentences as training sentences to build a sample library.
  • The knowledge base is built by collecting unstructured sentences containing any two enterprise entities from historical news data. For example, to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base and used as training sentences to build a sample library.
  • The relationships between enterprise entity pairs include capital transactions, supply chain, and cooperation relationships. For example, the enterprise entity pair contained in the sentence "Foxconn is a supplier of Mobike" is "Foxconn" and "Mobike", and the relationship "supplier" between the two entities belongs to the supply chain category.
  • All training sentences containing a given enterprise entity pair are extracted from the sample library; each training sentence includes the names of the pair of enterprise entities and the relationship type of the pair, and a word segmentation tool is used to segment each training sentence.
  • Each training sentence can be segmented using a tool such as the Stanford Chinese word segmenter or the NLPIR Chinese word segmenter; a stand-in sketch of this step follows below.
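For illustration only: the patent names the Stanford and NLPIR Chinese segmenters, and the sketch below substitutes the open-source jieba tokenizer as a stand-in, so the tool choice and the exact token boundaries shown in the comment are assumptions.

```python
# jieba stands in here for the Stanford/NLPIR segmenters named in the patent.
import jieba  # pip install jieba

sentence = "富士康是摩拜单车的供应商"  # "Foxconn is a supplier of Mobike"
tokens = jieba.lcut(sentence)
# Expected to be close to: 富士康 | 是 | 摩拜单车 | 的 | 供应商
# (the exact boundaries depend on the segmenter's dictionary)
print(" | ".join(tokens))
```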
  • Each segmented word is expressed as a one-hot vector to obtain its initial word vector.
  • The one-hot method represents each word as a long vector whose dimensionality equals the number of words in the vocabulary: exactly one dimension has the value 1 and the remaining dimensions are 0, and that vector represents the current word.
  • For example, all training sentences containing Foxconn and Mobike are extracted from the sample library, and each training sentence includes the two enterprise entity names, Foxconn and Mobike, and the relationship type of the entity pair (supplier).
  • Segmenting "Foxconn is a supplier of Mobike" yields the result "Foxconn | is | Mobike | of | supplier".
  • For instance, the initial word vector of "Foxconn" is [0100000000] and the initial word vector of "is" is [0010000000]. A minimal sketch of this one-hot encoding follows below.
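A minimal sketch of the one-hot encoding described above, assuming a toy vocabulary built from the running example (the vocabulary contents and ordering are illustrative only, not from the patent):

```python
import numpy as np

# Assumed toy vocabulary; in practice it would cover the whole sample library.
vocab = ["<pad>", "Foxconn", "is", "Mobike", "of", "supplier"]
word_to_id = {w: i for i, w in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    """One-hot initial word vector: exactly one dimension is 1, the rest are 0."""
    v = np.zeros(len(vocab))
    v[word_to_id[word]] = 1.0
    return v

segmented = ["Foxconn", "is", "Mobike", "of", "supplier"]
initial_word_vectors = [one_hot(w) for w in segmented]
print(initial_word_vectors[0])  # "Foxconn" -> [0. 1. 0. 0. 0. 0.]
```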
  • Then each training sentence is labeled with a sentence ID, and the sentence ID is mapped to the initial sentence vector corresponding to that training sentence.
  • The initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence are input into the continuous bag-of-words model, and the word vector x i of that word is predicted.
  • The initial sentence vector is then updated and replaced by a first updated sentence vector; the first updated sentence vector and the initial word vectors of the left and right neighbors of the next word in the training sentence are input into the continuous bag-of-words model, the word vector x i+1 of that word is predicted, and the first updated sentence vector is updated and replaced by a second updated sentence vector. Training iterates in this way, updating the sentence vector of the training sentence at each step, until the word vector of every word in the training sentence has been predicted.
  • For example, the initial word vectors of the left neighbor "Foxconn" and the right neighbor "Mobike" of "is", together with the initial sentence vector, are input into the continuous bag-of-words model, and the word vector x 2 of "is" is predicted.
  • The initial sentence vector is updated once to obtain the first updated sentence vector; the initial or current word vector of the left available neighbor "is", the initial word vector of the right available neighbor "of", and the first updated sentence vector are input into the continuous bag-of-words model, the word vector x 3 of "Mobike" is predicted, and the first updated sentence vector is updated to obtain the second updated sentence vector. Training iterates in this way until the word vectors x i of all available words have been predicted and the final update yields the sentence vector S i of the training sentence. Throughout this process the sentence ID of each news sentence remains unchanged. A sketch of this procedure follows below.
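For illustration only, the following numpy sketch implements a paragraph-vector-style reading of this procedure: the sentence vector and the neighboring word vectors jointly predict the center word through a softmax, and every prediction step also updates the sentence vector. The embedding dimension, learning rate, and epoch count are assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
segmented = ["Foxconn", "is", "Mobike", "of", "supplier"]
vocab = sorted(set(segmented))
word_to_id = {w: i for i, w in enumerate(vocab)}
tokens = [word_to_id[w] for w in segmented]

V, D = len(vocab), 16               # vocabulary size, embedding dim (assumed)
W_in = rng.normal(0, 0.1, (V, D))   # input word vectors (become the x_i)
W_out = rng.normal(0, 0.1, (V, D))  # output (prediction) weights
sent_vec = rng.normal(0, 0.1, D)    # the per-sentence vector, keyed by sentence ID

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.05
for epoch in range(50):
    for pos, center in enumerate(tokens):
        # context = current vectors of the left and right neighbors (skip edges)
        ctx = [tokens[j] for j in (pos - 1, pos + 1) if 0 <= j < len(tokens)]
        h = (W_in[ctx].sum(axis=0) + sent_vec) / (len(ctx) + 1)
        p = softmax(W_out @ h)          # predict the center word
        err = p.copy()
        err[center] -= 1.0              # cross-entropy gradient
        grad_h = W_out.T @ err
        W_out -= lr * np.outer(err, h)
        for c in ctx:
            W_in[c] -= lr * grad_h / (len(ctx) + 1)
        sent_vec -= lr * grad_h / (len(ctx) + 1)  # sentence vector updated each step

x_i = W_in[tokens]  # trained word vectors x_i
S_i = sent_vec      # final sentence vector S_i for this training sentence
```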
  • In the second layer of the RNN model, a long short-term memory (LSTM) module is then used to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1.
  • The two hidden layer state vectors are spliced by the Concatenate function to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence.
  • For example, in the sentence "Foxconn is a supplier of Mobike", the LSTM calculates, from left to right, the first hidden layer state vector h 2 of the word vector x 2 of "is" from the hidden layer state vector h 1 of the word vector x 1 of "Foxconn", and calculates, from right to left, the second hidden layer state vector h 2' of the word vector x 2 of "is" from the hidden layer state vector h 3 of the word vector x 3 of "Mobike".
  • The Concatenate function splices the two hidden layer state vectors (h 2 and h 2') to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence. A sketch of this bidirectional pass follows below.
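For illustration only, this PyTorch sketch (the patent names no framework; PyTorch is an assumption) runs a bidirectional LSTM over a sentence of word vectors, so that each word's output is the concatenation of its left-to-right state h i and its right-to-left state h i'. Mean pooling the concatenated states into T i is also an assumption; the patent does not pin down how the per-word vectors are reduced to the sentence feature vector.

```python
import torch
import torch.nn as nn

D, H = 16, 32                      # word-vector and hidden dims (assumed)
x = torch.randn(1, 5, D)           # one sentence of 5 word vectors x_1..x_5

# bidirectional=True runs the left-to-right and right-to-left passes together
lstm = nn.LSTM(input_size=D, hidden_size=H, batch_first=True, bidirectional=True)
out, _ = lstm(x)                   # out: (1, 5, 2*H) = [h_i ; h_i'] per word

integrated = out[0]                # integrated hidden state of each word
T_i = integrated.mean(dim=0)       # sentence feature vector T_i (pooling assumed)
print(T_i.shape)                   # torch.Size([64])
```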
  • In the third layer of the RNN model, the average vector S of the training sentences is calculated from the feature vector T i of each training sentence using the average vector expression:
  • S = sum(a i * T i) / n
  • where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
  • In the last layer of the RNN model, the average vector S is substituted into the softmax classification function:
  • σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
  • where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability that the relationship type to be predicted is each enterprise relationship type. A numeric sketch follows below.
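Purely as a numeric illustration, this sketch computes the weighted average vector S = sum(a i * T i) / n and applies a softmax over K relation-type scores. The linear layer mapping S to the K logits z is an assumption; the patent states only that S and the relationship type are substituted into the softmax classification function.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, K = 4, 64, 3                 # sentences, feature dim, relation types (assumed)
T = rng.normal(size=(n, d))        # feature vectors T_i of the n training sentences
a = np.ones(n)                     # weights a_i, optimized during training

S = (a[:, None] * T).sum(axis=0) / n   # S = sum(a_i * T_i) / n

W = rng.normal(size=(K, d))        # assumed linear layer: one score per relation type
z = W @ S                          # z_j, the score of relation type j

sigma = np.exp(z - z.max()) / np.exp(z - z.max()).sum()  # softmax probabilities
print(sigma, sigma.sum())          # probabilities over the K types; sums to 1
```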
  • The weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair in the training sentence. Through continuous learning, the weights a i are continually optimized, so that informative sentences obtain higher weights and noisy sentences obtain smaller weights.
  • In this embodiment, once the RNN model is determined, relationship prediction can be performed on any unstructured sentence containing an enterprise entity pair; the model's prediction is not tied to specific enterprise names.
  • Sentences containing the two enterprise entities whose relationship is to be predicted are extracted from the current text, and the sentences are segmented to obtain sentence vectors.
  • For example, S 1, S 2, S 3, and S 4 represent the vector set of the sentences corresponding to the two enterprise entities.
  • A bidirectional long short-term memory (bi-LSTM) module extracts the feature vectors T 1, T 2, T 3, T 4 of the sentences, and the feature vectors are input into the trained RNN model to obtain the relationship prediction result for the two enterprise entities.
  • The enterprise relationship extraction method proposed in the above embodiment builds a sample library by extracting, from unstructured text, training sentences of the enterprise entity pairs that have a relationship in the knowledge base.
  • All training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated using the LSTM.
  • The average vector S of the training sentences is calculated by the average vector formula, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair.
  • Finally, a sentence containing two enterprise entities is extracted from the current text, and the feature vector T i of the sentence is obtained by the bi-LSTM.
  • The feature vector T i is input into the trained RNN model to predict the relationship between the two enterprise entities. This not only removes the tedious step of manually annotating training data, but also achieves better precision and recall than other supervision approaches.
  • As shown in FIG. 2, it is a block diagram of a preferred embodiment of the enterprise relationship extraction program 10 of FIG. 1.
  • a module as referred to in this application refers to a series of computer program instructions that are capable of performing a particular function.
  • In this embodiment, the enterprise relationship extraction program 10 includes: a building module 110, a word segmentation module 120, a splicing module 130, a calculation module 140, a weight determination module 150, and a prediction module 160. The functions or operation steps implemented by modules 110-160 are similar to those described above and are not detailed again here; exemplarily:
  • The building module 110 is configured to extract, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and use them as training sentences to build a sample library;
  • The word segmentation module 120 is configured to extract from the sample library all training sentences containing a given enterprise entity pair, segment each training sentence using a preset word segmentation tool, map each segmented word into a word vector x i, and map each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
  • The splicing module 130 is configured to calculate, in the second layer of the RNN model, the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; to splice the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then to obtain the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • The calculation module 140 is configured to calculate, in the third layer of the RNN model, the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • The weight determination module 150 is configured to calculate, in the last layer of the RNN model, the weight a i of each training sentence by substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function;
  • The prediction module 160 is configured to extract a sentence containing two enterprise entities from the current text, obtain the feature vector T i of the sentence through the bi-LSTM, and input the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
  • As shown in FIG. 3, it is a flowchart of a preferred embodiment of the enterprise relationship extraction method of the present application.
  • Step S10: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
  • Step S20: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
  • Step S30: in the second layer of the RNN model, calculating the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • Step S40: in the third layer of the RNN model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • Step S50: in the last layer of the RNN model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
  • Step S60: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through the bi-LSTM, and inputting the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
  • In this embodiment, it is assumed that if two enterprise entities have a certain relationship in the knowledge base, then any unstructured sentence containing the two enterprise entities can express that relationship.
  • The knowledge base is built by collecting unstructured sentences containing any two enterprise entities from historical news data. For example, to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base and used as training sentences to build a sample library.
  • The relationships between enterprise entity pairs include capital transactions, supply chain, and cooperation relationships.
  • For example, sentences containing the enterprise entity pair "Foxconn" and "Mobike" are extracted from unstructured text as training sentences; in the sentence "Foxconn is a supplier of Mobike", the enterprise entity pair is "Foxconn" and "Mobike", and the relationship "supplier" between the two entities belongs to the supply chain category.
  • All training sentences containing a given enterprise entity pair are extracted from the sample library; each training sentence includes the names of the pair of enterprise entities and the relationship type of the pair, and a word segmentation tool is used to segment each training sentence.
  • For example, all training sentences containing Foxconn and Mobike are extracted from the sample library, and each training sentence includes the two enterprise entity names, Foxconn and Mobike, and the relationship type of the entity pair (supplier).
  • Each training sentence is segmented using a tool such as the Stanford Chinese word segmenter or the NLPIR Chinese word segmenter. For example, segmenting "Foxconn is a supplier of Mobike" yields the result "Foxconn | is | Mobike | of | supplier".
  • Each segmented word is expressed as a one-hot vector to obtain its initial word vector.
  • The one-hot method represents each word as a long vector whose dimensionality equals the number of words in the vocabulary: exactly one dimension has the value 1 and the remaining dimensions are 0, and that vector represents the current word. For example, the initial word vector of "Foxconn" is [0100000000], and the initial word vector of "is" is [0010000000].
  • Then each training sentence is labeled with a sentence ID, and the sentence ID is mapped to the initial sentence vector corresponding to that training sentence.
  • The initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence are input into the continuous bag-of-words model, and the word vector x i of that word is predicted. The initial sentence vector is updated and replaced by a first updated sentence vector; the first updated sentence vector and the initial word vectors of the left and right neighbors of the next word are input into the continuous bag-of-words model, the word vector x i+1 of that word is predicted, and the first updated sentence vector is updated and replaced by a second updated sentence vector. Training iterates in this way, updating the sentence vector of the training sentence at each step, until every word vector in the training sentence has been predicted.
  • For example, the initial word vectors of the left neighbor "Foxconn" and the right neighbor "Mobike" of "is", together with the initial sentence vector, are input into the continuous bag-of-words model, and the word vector x 2 of "is" is predicted; the initial sentence vector is updated once to obtain the first updated sentence vector.
  • The initial or current word vector of the left available neighbor "is", the initial word vector of the right available neighbor "of", and the first updated sentence vector are input into the continuous bag-of-words model, the word vector x 3 of "Mobike" is predicted, and the first updated sentence vector is updated to obtain the second updated sentence vector. Training iterates in this way until the word vectors x i of all available words have been predicted and the final update yields the sentence vector S i of the training sentence. Throughout this process the sentence ID of each news sentence remains unchanged.
  • In the second layer of the RNN model, the LSTM then calculates, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1.
  • The Concatenate function is used to splice the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence.
  • For example, in the sentence "Foxconn is a supplier of Mobike", the LSTM calculates, from left to right, the first hidden layer state vector h 2 of the word vector x 2 of "is" from the hidden layer state vector h 1 of the word vector x 1 of "Foxconn", and calculates, from right to left, the second hidden layer state vector h 2' of the word vector x 2 of "is" from the hidden layer state vector h 3 of the word vector x 3 of "Mobike".
  • The Concatenate function splices the two hidden layer state vectors (h 2 and h 2') to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence.
  • In the third layer of the RNN model, the average vector S of the training sentences is calculated from the feature vector T i of each training sentence using the average vector formula S = sum(a i * T i) / n, where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
  • Suppose 50,000 training sentences for the "Foxconn" and "Mobike" entity pair are extracted from the knowledge base; then the feature vector T i of each training sentence is substituted into the formula S = sum(a i * T i) / n to calculate the average vector S, where n equals 50,000.
  • In the last layer of the RNN model, the average vector S is then substituted into the softmax classification function:
  • σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
  • where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability that the relationship type to be predicted is each enterprise relationship type.
  • The weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair in the training sentence. Through continuous iterative learning, the weights a i are continually optimized, so that informative sentences obtain higher weights and noisy sentences obtain smaller weights, yielding a reliable RNN model.
  • In this embodiment, once the RNN model is determined, relationship prediction can be performed on any unstructured sentence containing an enterprise entity pair; the model's prediction is not tied to specific enterprise names.
  • Finally, as shown in FIG. 4, it is a frame diagram of the prediction module of the present application. Sentences containing the two enterprise entities whose relationship is to be predicted are extracted from the current text, for example sentences containing "China Ping An Group" and "Bank of China" extracted from the news, and the sentences are segmented to obtain sentence vectors.
  • For example, S 1, S 2, S 3, and S 4 represent the vector set of the sentences corresponding to the two enterprise entities.
  • The feature vectors T 1, T 2, T 3, T 4 of the sentences are extracted by the bi-LSTM; the weight of each T i within the whole sentence set is assigned by computing the similarity between T i and the relation type vector r; and finally, after the weighted sum of the sentences is taken, the relationship between "China Ping An Group" and "Bank of China" is predicted by the softmax classifier. A sketch of this weighting scheme follows below.
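For illustration only, here is a numpy sketch of this weighting scheme under stated assumptions: the similarity between each T i and the relation type vector r is taken as a dot product (the patent says only "similarity"), the similarities are normalized into sentence weights, and an assumed linear softmax classifier predicts the relation from the weighted sum.

```python
import numpy as np

rng = np.random.default_rng(2)
m, d, K = 4, 64, 3                  # sentences, feature dim, relation types (assumed)
T = rng.normal(size=(m, d))         # T_1..T_4 from the bi-LSTM
r = rng.normal(size=d)              # learned relation type vector r

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

sim = T @ r                         # similarity of each T_i to r (dot product assumed)
weights = softmax(sim)              # weight of T_i within the whole sentence set

s = weights @ T                     # weighted sum over the sentence set
W = rng.normal(size=(K, d))         # assumed classifier layer
probs = softmax(W @ s)              # relation probabilities for the entity pair
print(probs.argmax())               # predicted relationship type index
```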
  • The enterprise relationship extraction method proposed in the above embodiment extracts, from unstructured text, sentences of enterprise entity pairs that have a relationship in the knowledge base, uses them as training sentences, and builds a sample library.
  • All training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated using the LSTM.
  • The average vector S of the training sentences is then calculated by the average vector formula, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair.
  • Finally, a sentence containing two enterprise entities is extracted from the current text, and the feature vector T i of the sentence is obtained by the bi-LSTM.
  • The feature vector T i is input into the trained RNN model to predict the relationship between the two enterprise entities. This improves the ability to identify relationships between different enterprises in the news and to give early warning of enterprise risk, and removes the tedious step of manually annotating training data.
  • In addition, the embodiment of the present application further provides a computer-readable storage medium that includes an enterprise relationship extraction program 10; when the enterprise relationship extraction program 10 is executed by the processor, the following operations are implemented:
  • Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
  • Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
  • Splicing step: in the second layer of the RNN model, calculating the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
  • Calculating step: in the third layer of the RNN model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
  • Weight determining step: in the last layer of the RNN model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
  • Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through the bi-LSTM, and inputting the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
  • the word segmentation step comprises:
  • Each segmented word is expressed as a one-hot vector to obtain its initial word vector; each training sentence is labeled with a sentence ID, and the sentence ID is mapped to the initial sentence vector corresponding to the training sentence. The initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence are input into the continuous bag-of-words model, and the word vector x i of that word is predicted; the sentence vector of the training sentence is updated after each prediction, until the word vector x i of every word in the training sentence has been predicted, and the sentence vector after the last update is taken as the sentence vector S i of the training sentence.
  • the splicing step comprises:
  • From left to right, the first hidden layer state vector h i of the current word vector x i is calculated from the hidden layer state vector h i-1 of the preceding word vector x i-1; and from right to left, the second hidden layer state vector h i' of the current word vector x i is calculated from the hidden layer state vector h i+1 of the following word vector x i+1.
  • The average vector expression is:
  • S = sum(a i * T i) / n
  • where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
  • The softmax classification function expression is:
  • σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
  • where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability that the relationship type to be predicted is each enterprise relationship type.
  • The technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Databases & Information Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The present application discloses an enterprise relationship extraction method, device, and storage medium. The method comprises: extracting, from a knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library; extracting from the sample library all training sentences containing a given enterprise entity pair and segmenting them, mapping each word into a word vector x i and each training sentence into a sentence vector S i; using an LSTM to calculate the first hidden layer state vector h i and the second hidden layer state vector h i' of each word vector x i, splicing them to obtain the integrated hidden layer state vector, and then obtaining the feature vector T i; substituting the feature vectors T i into the average vector expression to calculate the average vector S; substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence; and extracting a sentence containing two enterprise entities, obtaining its feature vector T i through a bi-LSTM, and inputting it into the trained RNN model to predict the relationship between the two enterprises, thereby reducing labor costs and predicting the relationship between the two enterprise entities more accurately.

Description

Enterprise relationship extraction method, device and storage medium
Priority Claim
This application claims priority to the Chinese patent application filed with the China Patent Office on November 2, 2017 under application number 201711061205.0 and entitled "Enterprise relationship extraction method, device and storage medium", the entire content of which is incorporated into this application by reference.
Technical Field
The present application relates to the field of data information processing technologies, and in particular to an enterprise relationship extraction method, device, and computer-readable storage medium.
Background
Identifying associations between different enterprises in the news, such as capital transactions, supply chain, and cooperation relationships, is of great significance for enterprise risk early warning. However, common entity relationship extraction methods require manual annotation of large amounts of training data, and corpus annotation is generally very time-consuming and labor-intensive.
Summary
In view of the above, the present application provides an enterprise relationship extraction method, device, and computer-readable storage medium, which can extend a relationship extraction model based on a convolutional neural network to distantly supervised data, effectively reducing the model's dependence on manually annotated data; moreover, this supervised enterprise relationship extraction method achieves better precision and recall than semi-supervised or unsupervised methods.
To achieve the above object, the present application provides an enterprise relationship extraction method, the method comprising:
Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
In addition, the present application further provides an electronic device, comprising: a memory, a processor, and an enterprise relationship extraction program stored on the memory and operable on the processor, where the enterprise relationship extraction program, when executed by the processor, implements the following steps:
Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
In addition, to achieve the above object, the present application further provides a computer-readable storage medium that includes an enterprise relationship extraction program which, when executed by a processor, can implement any step of the enterprise relationship extraction method described above.
The enterprise relationship extraction method, electronic device, and computer-readable storage medium proposed by the present application extract, from unstructured text, sentences of enterprise entity pairs that have a relationship in the knowledge base, use them as training sentences, and build a sample library. Then all training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated by the long short-term memory module. Next, the average vector S of the training sentences is calculated from the feature vectors T i, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair. Finally, a sentence containing two enterprise entities is extracted from the current text, the feature vector T of the sentence is obtained through the bidirectional long short-term memory module, and the feature vector T is input into the trained recurrent neural network model to predict the relationship between the two enterprise entities. This improves the ability to identify relationships between different enterprises in the news and reduces dependence on manual annotation of training data.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a preferred embodiment of the electronic device of the present application;
FIG. 2 is a schematic block diagram of a preferred embodiment of the enterprise relationship extraction program of FIG. 1;
FIG. 3 is a flowchart of a preferred embodiment of the enterprise relationship extraction method of the present application;
FIG. 4 is a frame diagram of the prediction module of the present application.
The realization of the object, functional features, and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of the Embodiments
It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
As shown in FIG. 1, it is a schematic diagram of a preferred embodiment of the electronic device 1 of the present application.
In this embodiment, the electronic device 1 may be a server, a smartphone, a tablet computer, a personal computer, a portable computer, or another electronic device with computing functions.
The electronic device 1 includes: a memory 11, a processor 12, a knowledge base 13, a network interface 14, and a communication bus 15. The knowledge base 13 is stored in the memory 11, and sentences containing enterprise entity pairs are extracted from the knowledge base 13 as training sentences to build a sample library.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The communication bus 15 is used to implement connection and communication between these components.
The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as the hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage unit of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card).
In this embodiment, the memory 11 can be used not only to store application software installed on the electronic device 1 and various types of data, such as the enterprise relationship extraction program 10, the knowledge base 13, and the sample library, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a Central Processing Unit (CPU), microprocessor, or other data processing chip for running program code stored in the memory 11 or processing data, for example executing the computer program code of the enterprise relationship extraction program 10 and training the various models.
Preferably, the electronic device 1 may further include a display, which may be called a display screen or a display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like. The display is used to display information processed in the electronic device 1 and a visualized work interface, for example displaying the results of model training and the optimal values of the weights a i.
Preferably, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard and an audio output device such as a speaker or headphones; optionally, the user interface may also include a standard wired interface and a wireless interface.
In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, stores the program code of the enterprise relationship extraction program 10; when the processor 12 executes the program code of the enterprise relationship extraction program 10, the following steps are implemented:
Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
In this embodiment, it is assumed that if two enterprise entities have a certain relationship in the knowledge base, then any unstructured sentence containing the two enterprise entities can express that relationship. Therefore, when we need to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base, and those sentences are used as training sentences to build a sample library. The knowledge base is built by collecting unstructured sentences containing any two enterprise entities from historical news data. For example, to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base and used as training sentences to build a sample library. The relationships between enterprise entity pairs include capital transactions, supply chain, cooperation, and other relationships. For example, the enterprise entity pair contained in the sentence "富士康是摩拜单车的供应商" ("Foxconn is a supplier of Mobike") is "Foxconn" and "Mobike", and the relationship "supplier" between the enterprise entities belongs to the supply chain category.
All training sentences containing a given enterprise entity pair are extracted from the sample library; each training sentence includes the names of the pair of enterprise entities and the relationship type of the pair, and a word segmentation tool is used to segment each training sentence. Tools such as the Stanford Chinese word segmenter or the NLPIR Chinese word segmenter may be used. Each segmented word is expressed as a one-hot vector to obtain its initial word vector. The one-hot method represents each word as a very long vector whose dimensionality equals the number of words, where only one dimension has the value 1 and the remaining dimensions are 0; that vector represents the current word. For example, all training sentences containing Foxconn and Mobike are extracted from the sample library, and each training sentence includes the two enterprise entity names, Foxconn and Mobike, and the relationship type of the entity pair (supplier). Segmenting "富士康是摩拜单车的供应商" yields "富士康|是|摩拜单车|的|供应商" ("Foxconn | is | Mobike | of | supplier"). For instance, the initial word vector of "Foxconn" is [0100000000] and the initial word vector of "is" is [0010000000]. Then each training sentence is labeled with an ID, and the sentence ID is mapped to the initial sentence vector corresponding to the training sentence.
The initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence are input into the continuous bag-of-words model, and the word vector x i of that word is predicted. The initial sentence vector is updated and replaced by a first updated sentence vector; the first updated sentence vector and the initial word vectors of the left and right neighbors of the next word in the training sentence are input into the continuous bag-of-words model, the word vector x i+1 of that word is predicted, and the first updated sentence vector is updated and replaced by a second updated sentence vector. Training iterates in this way, updating the sentence vector of the training sentence at each step, until the word vector x i of every word in the training sentence, i = (0, 1, 2, 3, ..., m), has been predicted; the sentence vector after the last training update is taken as the sentence vector S i of the training sentence, i = (0, 1, 2, 3, ..., n), which serves as the input to the first layer of the recurrent neural network (RNN) model. For example, the initial word vectors of the left available neighbor "Foxconn" and the right available neighbor "Mobike" of "is", together with the initial sentence vector, are input into the continuous bag-of-words model, and the word vector x 2 of "is" is predicted; the initial sentence vector is updated once to obtain the first updated sentence vector. The initial or current word vector of the left available neighbor "is", the initial word vector of the right available neighbor "of", and the first updated sentence vector are input into the continuous bag-of-words model, the word vector x 3 of "Mobike" is predicted, and the first updated sentence vector is updated to obtain the second updated sentence vector. Training iterates in this way until the word vectors x i of all available words have been predicted and the sentence vector S i of the training sentence is obtained by updating. Throughout this process, the sentence ID of each news sentence remains unchanged.
In the second layer of the RNN model, the long short-term memory (LSTM) module is then used to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1; the two hidden layer state vectors are spliced by the Concatenate function to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence, i = (0, 1, 2, 3, ..., n), is then obtained from the integrated hidden layer state vectors of all words in the sentence. For example, in the sentence "Foxconn is a supplier of Mobike", the LSTM calculates, from left to right, the first hidden layer state vector h 2 of the word vector x 2 of "is" from the hidden layer state vector h 1 of the word vector x 1 of "Foxconn", and calculates, from right to left, the second hidden layer state vector h 2' of the word vector x 2 of "is" from the hidden layer state vector h 3 of the word vector x 3 of "Mobike"; the Concatenate function splices the two hidden layer state vectors (h 2 and h 2') to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence.
In the third layer of the RNN model, the average vector S of the training sentences is calculated from the feature vector T i of each training sentence using the average vector formula S = sum(a i * T i) / n, where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
In the last layer of the RNN model, the average vector S is substituted into the softmax classification function:
σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability of the relationship type to be predicted under each enterprise relationship type. The weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair in the training sentence. Through continuous learning, the weights a i are continually optimized, so that informative sentences obtain higher weights and noisy sentences obtain smaller weights.
In this embodiment, once the RNN model is determined, relationship prediction can be performed on any unstructured sentence containing an enterprise entity pair; the model's prediction is not tied to specific enterprise names.
Sentences containing the two enterprise entities whose relationship is to be predicted are extracted from the current text, and the sentences are segmented to obtain sentence vectors. For example, S 1, S 2, S 3, S 4 represent the vector set of the sentences corresponding to the two enterprise entities. The bidirectional long short-term memory (bi-LSTM) module extracts the feature vectors T 1, T 2, T 3, T 4 of the sentences, and the feature vectors of the sentences are input into the trained RNN model to obtain the relationship prediction result for the two enterprise entities.
The enterprise relationship extraction method proposed in the above embodiment builds a sample library by extracting, from unstructured text, training sentences of the enterprise entity pairs that have a relationship in the knowledge base. All training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated using the LSTM. The average vector S of the training sentences is calculated by the average vector formula, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair. Finally, a sentence containing two enterprise entities is extracted from the current text, the feature vector T i of the sentence is obtained by the bi-LSTM, and the feature vector T i is input into the trained RNN model to predict the relationship between the two enterprise entities. This not only removes the tedious step of manually annotating training data, but also achieves better precision and recall than other supervision approaches.
As shown in FIG. 2, it is a block diagram of a preferred embodiment of the enterprise relationship extraction program 10 of FIG. 1. A module as referred to in this application is a series of computer program instruction segments capable of performing a particular function.
In this embodiment, the enterprise relationship extraction program 10 includes: a building module 110, a word segmentation module 120, a splicing module 130, a calculation module 140, a weight determination module 150, and a prediction module 160. The functions or operation steps implemented by modules 110-160 are similar to those described above and are not detailed again here; exemplarily:
The building module 110 is configured to extract, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and use them as training sentences to build a sample library;
The word segmentation module 120 is configured to extract from the sample library all training sentences containing a given enterprise entity pair, segment each training sentence with a preset word segmentation tool, map each segmented word into a word vector x i, and map each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
The splicing module 130 is configured to calculate, in the second layer of the RNN model, the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; to splice the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then to obtain the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
The calculation module 140 is configured to calculate, in the third layer of the RNN model, the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
The weight determination module 150 is configured to calculate, in the last layer of the RNN model, the weight a i of each training sentence by substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function;
The prediction module 160 is configured to extract a sentence containing two enterprise entities from the current text, obtain the feature vector T i of the sentence through the bi-LSTM, and input the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
As shown in FIG. 3, it is a flowchart of a preferred embodiment of the enterprise relationship extraction method of the present application.
In this embodiment, when the processor 12 executes the computer program of the enterprise relationship extraction program 10 stored in the memory 11, the following steps of the enterprise relationship extraction method are implemented:
Step S10: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
Step S20: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
Step S30: in the second layer of the RNN model, calculating the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
Step S40: in the third layer of the RNN model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
Step S50: in the last layer of the RNN model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
Step S60: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through the bi-LSTM, and inputting the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
In this embodiment, it is assumed that if two enterprise entities have a certain relationship in the knowledge base, then any unstructured sentence containing the two enterprise entities can express that relationship. When we need to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base, and those sentences are used as training sentences to build a sample library. The knowledge base is built by collecting unstructured sentences containing any two enterprise entities from historical news data. For example, to identify the association between two enterprise entities in the news, all unstructured sentences containing the two entities are extracted from the knowledge base and used as training sentences to build a sample library. The relationships between enterprise entity pairs include capital transactions, supply chain, cooperation, and other relationships. For example, sentences containing the enterprise entity pair "Foxconn" and "Mobike" are extracted from unstructured text as training sentences; in the sentence "富士康是摩拜单车的供应商" ("Foxconn is a supplier of Mobike"), the enterprise entity pair is "Foxconn" and "Mobike", and the relationship "supplier" between the enterprise entities belongs to the supply chain category.
All training sentences containing a given enterprise entity pair are extracted from the sample library; each training sentence includes the names of the pair of enterprise entities and the relationship type of the pair, and a word segmentation tool is used to segment each training sentence. For example, all training sentences containing Foxconn and Mobike are extracted from the sample library, and each training sentence includes the two enterprise entity names, Foxconn and Mobike, and the relationship type of the entity pair (supplier). Each training sentence is segmented using a tool such as the Stanford Chinese word segmenter or the NLPIR Chinese word segmenter. For example, segmenting "富士康是摩拜单车的供应商" yields "富士康|是|摩拜单车|的|供应商" ("Foxconn | is | Mobike | of | supplier"). Each segmented word is expressed as a one-hot vector to obtain its initial word vector. The one-hot method represents each word as a very long vector whose dimensionality equals the number of words, where only one dimension has the value 1 and the remaining dimensions are 0; that vector represents the current word. For example, the initial word vector of "Foxconn" is [0100000000], and the initial word vector of "is" is [0010000000]. Then each training sentence is labeled with an ID, and the sentence ID is mapped to the initial sentence vector corresponding to the training sentence.
The initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence are input into the continuous bag-of-words model, and the word vector x i of that word is predicted. The initial sentence vector is updated and replaced by a first updated sentence vector; the first updated sentence vector and the initial word vectors of the left and right neighbors of the next word in the training sentence are input into the continuous bag-of-words model, the word vector x i+1 of that word is predicted, and the first updated sentence vector is updated and replaced by a second updated sentence vector. Training iterates in this way, updating the sentence vector of the training sentence at each step, until the word vector x i of every word in the training sentence, i = (0, 1, 2, 3, ..., m), has been predicted; the sentence vector after the last training update is taken as the sentence vector S i of the training sentence, i = (0, 1, 2, 3, ..., n). For example, in the sentence "Foxconn is a supplier of Mobike", the initial word vectors of the left available neighbor "Foxconn" and the right available neighbor "Mobike" of "is", together with the initial sentence vector, are input into the continuous bag-of-words model, and the word vector x 2 of "is" is predicted; the initial sentence vector is updated once to obtain the first updated sentence vector. The initial or current word vector of the left available neighbor "is", the initial word vector of the right available neighbor "of", and the first updated sentence vector are input into the continuous bag-of-words model, the word vector x 3 of "Mobike" is predicted, and the first updated sentence vector is updated to obtain the second updated sentence vector. Training iterates in this way until the word vectors x i of all available words have been predicted and the sentence vector S i of the training sentence is obtained by updating. Throughout this process, the sentence ID of each news sentence remains unchanged.
In the second layer of the RNN model, the LSTM is then used to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1; the two hidden layer state vectors are spliced by the Concatenate function to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence, i = (0, 1, 2, 3, ..., n), is then obtained from the integrated hidden layer state vectors of all words in the sentence. For example, in the sentence "Foxconn is a supplier of Mobike", the LSTM calculates, from left to right, the first hidden layer state vector h 2 of the word vector x 2 of "is" from the hidden layer state vector h 1 of the word vector x 1 of "Foxconn", and calculates, from right to left, the second hidden layer state vector h 2' of the word vector x 2 of "is" from the hidden layer state vector h 3 of the word vector x 3 of "Mobike"; the Concatenate function splices the two hidden layer state vectors (h 2 and h 2') to obtain the integrated hidden layer state vector of each word in the training sentence, and the feature vector T i of each training sentence is then obtained from the integrated hidden layer state vectors of all words in the sentence.
In the third layer of the RNN model, the average vector S of the training sentences is calculated from the feature vector T i of each training sentence using the average vector formula S = sum(a i * T i) / n, where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences. Suppose 50,000 training sentences for the "Foxconn" and "Mobike" entity pair are extracted from the knowledge base; then the feature vector T i of each training sentence, i = (0, 1, 2, 3, ..., n), is substituted into the average vector formula S = sum(a i * T i) / n to calculate the average vector S, where n equals 50,000.
In the last layer of the RNN model, the average vector S is then substituted into the softmax classification function:
σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability of the relationship type to be predicted under each enterprise relationship type. The weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair in the training sentence. Through continuous iterative learning, the weights a i are continually optimized, so that informative sentences obtain higher weights and noisy sentences obtain smaller weights, yielding a reliable RNN model.
In this embodiment, once the RNN model is determined, relationship prediction can be performed on any unstructured sentence containing an enterprise entity pair; the model's prediction is not tied to specific enterprise names.
Finally, as shown in FIG. 4, it is a frame diagram of the prediction module of the present application. Sentences containing the two enterprise entities whose relationship is to be predicted are extracted from the current text, for example sentences containing "中国平安集团" (China Ping An Group) and "中国银行" (Bank of China) extracted from the news, and the sentences are segmented to obtain sentence vectors. For example, S 1, S 2, S 3, S 4 represent the vector set of the sentences corresponding to the two enterprise entities. The feature vectors T 1, T 2, T 3, T 4 of the sentences are extracted by the bi-LSTM; the weight of each T i within the whole sentence set is assigned by computing the similarity between T i and the relation type vector r; and after the weighted sum of the sentences is taken, the relationship between "中国平安集团" and "中国银行" is predicted by the softmax classifier.
The enterprise relationship extraction method proposed in the above embodiment extracts, from unstructured text, sentences of enterprise entity pairs that have a relationship in the knowledge base, uses them as training sentences, and builds a sample library. All training sentences containing a given enterprise entity pair are extracted from the sample library and segmented; the sentence vector S i of each training sentence is obtained, and the feature vector T i of each training sentence is calculated using the LSTM. The average vector S of the training sentences is then calculated by the average vector formula, the average vector S is substituted into the softmax classification function, and the weight a i of each training sentence is determined according to the relationship type of the enterprise entity pair. Finally, a sentence containing two enterprise entities is extracted from the current text, the feature vector T i of the sentence is obtained by the bi-LSTM, and the feature vector T i is input into the trained RNN model to predict the relationship between the two enterprise entities. This improves the ability to identify relationships between different enterprises in the news and to give early warning of enterprise risk, and removes the tedious step of manually annotating training data.
In addition, an embodiment of the present application further provides a computer-readable storage medium that includes an enterprise relationship extraction program 10; when the enterprise relationship extraction program 10 is executed by a processor, the following operations are implemented:
Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of the RNN model;
Splicing step: in the second layer of the RNN model, calculating the first hidden layer state vector h i of the current word vector x i from left to right with the LSTM, and the second hidden layer state vector h i' of the current word vector x i from right to left; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence; and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
Calculating step: in the third layer of the RNN model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
Weight determining step: in the last layer of the RNN model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through the bi-LSTM, and inputting the feature vector T i into the trained RNN model to predict the relationship between the two enterprise entities.
Preferably, the word segmentation step comprises:
expressing each segmented word as a one-hot vector to obtain its initial word vector; labeling each training sentence with a sentence ID and mapping the sentence ID to the initial sentence vector corresponding to the training sentence; inputting the initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence into the continuous bag-of-words model to predict the word vector x i of that word; and updating the sentence vector of the training sentence after each prediction, until the word vector x i of every word in the training sentence has been predicted, with the sentence vector after the last update taken as the sentence vector S i of the training sentence.
Preferably, the splicing step comprises:
calculating, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and calculating, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1.
Preferably, the average vector expression is:
S = sum(a i * T i) / n
where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
Preferably, the softmax classification function expression is:
σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability that the relationship type to be predicted is each enterprise relationship type.
The specific implementation of the computer-readable storage medium of the present application is substantially the same as the specific implementation of the enterprise relationship extraction method described above and will not be repeated here.
The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the various embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. An enterprise relationship extraction method, characterized in that the method comprises:
    Sample library establishing step: extracting, from the knowledge base, sentences of enterprise entity pairs that have a relationship, and using them as training sentences to build a sample library;
    Word segmentation step: extracting from the sample library all training sentences containing a given enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x i, and mapping each training sentence into a sentence vector S i as the input to the first layer of a recurrent neural network model;
    Splicing step: in the second layer of the recurrent neural network model, using a long short-term memory module to calculate, from left to right, the first hidden layer state vector h i of the current word vector x i, and, from right to left, the second hidden layer state vector h i' of the current word vector x i; splicing the two hidden layer state vectors to obtain the integrated hidden layer state vector of each word in the training sentence, and then obtaining the feature vector T i of each training sentence from the integrated hidden layer state vectors of all words in the sentence;
    Calculating step: in the third layer of the recurrent neural network model, calculating the average vector S of the training sentences from the feature vector T i of each training sentence using the average vector expression;
    Weight determining step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to calculate the weight a i of each training sentence;
    Predicting step: extracting a sentence containing two enterprise entities from the current text, obtaining the feature vector T i of the sentence through a bidirectional long short-term memory module, and inputting the feature vector T i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  2. The enterprise relationship extraction method according to claim 1, characterized in that the word segmentation step comprises:
    expressing each segmented word as a one-hot vector to obtain its initial word vector; labeling each training sentence with a sentence ID and mapping the sentence ID to the initial sentence vector corresponding to the training sentence; inputting the initial sentence vector and the initial word vectors of the left and right neighboring words of a given word in the training sentence into the continuous bag-of-words model to predict the word vector x i of that word; and updating the sentence vector of the training sentence after each prediction, until the word vector x i of every word in the training sentence has been predicted, with the sentence vector after the last update taken as the sentence vector S i of the training sentence.
  3. The enterprise relationship extraction method according to claim 2, characterized in that the one-hot method represents each word as a multi-dimensional vector whose dimensionality equals the number of words, where only one dimension has the value 1 and the remaining dimensions are 0, and that vector represents the current word.
  4. The enterprise relationship extraction method according to claim 1, characterized in that the splicing step comprises:
    calculating, from left to right, the first hidden layer state vector h i of the current word vector x i from the hidden layer state vector h i-1 of the preceding word vector x i-1, and calculating, from right to left, the second hidden layer state vector h i' of the current word vector x i from the hidden layer state vector h i+1 of the following word vector x i+1.
  5. The enterprise relationship extraction method according to claim 1 or 4, characterized in that splicing the two hidden layer state vectors means using the Concatenate function to splice h i and h i' of each word in the training sentence to obtain the integrated hidden layer state vector of that word.
  6. The enterprise relationship extraction method according to claim 1, characterized in that the average vector expression is:
    S = sum(a i * T i) / n
    where a i represents the weight of a training sentence, T i represents the feature vector of each training sentence, and n represents the number of training sentences.
  7. The enterprise relationship extraction method according to claim 6, characterized in that the softmax classification function expression is:
    σ(z) j = e^(z j) / Σ k=1..K e^(z k), j = 1, ..., K
    where K represents the number of enterprise relationship types, S represents the average vector whose enterprise relationship type needs to be predicted, z j represents a given enterprise relationship type, and σ(z) j represents the probability that the relationship type to be predicted is each enterprise relationship type.
  8. An electronic device, characterized in that the device comprises a memory and a processor, the memory storing an enterprise relationship extraction program which, when executed by the processor, implements the following steps:
    a sample library establishing step: extracting, from a knowledge base, sentences of enterprise entity pairs that have a relationship as training sample sentences to establish a sample library;
    a word segmentation step: extracting from the sample library all training sentences containing an enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x_i, and mapping each training sentence into a sentence vector S_i as the input of the first layer of a recurrent neural network model;
    a concatenation step: in the second layer of the recurrent neural network model, computing, with a long short-term memory module, the first hidden layer state vector h_i of the current word vector x_i from left to right and the second hidden layer state vector h_i' of the current word vector x_i from right to left, concatenating the two hidden layer state vectors to obtain the combined hidden layer state vector of each word in the training sentence, and obtaining the feature vector T_i of each training sentence from the combined hidden layer state vectors of all the words in the training sentence;
    a calculation step: in the third layer of the recurrent neural network model, computing the average vector S over the training sentences from the feature vector T_i of each training sentence with the average-vector expression;
    a weight determination step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to compute the weight a_i of each training sentence;
    a prediction step: extracting sentences containing two enterprise entities from the current text, obtaining the feature vector T_i of each sentence through a bidirectional long short-term memory module, and inputting the feature vector T_i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  9. The electronic device according to claim 8, characterized in that the word segmentation step comprises:
    representing each segmented word as a one-hot vector to obtain an initial word vector; labeling each training sentence with a sentence ID and mapping the sentence ID into an initial sentence vector of the corresponding training sentence; inputting the initial sentence vector, together with the initial word vectors of the left and right neighboring words of a given word in the training sentence, into the continuous bag-of-words model to predict the word vector x_i of that word; updating the sentence vector of the training sentence after each prediction until the word vector x_i of every word in the training sentence has been predicted; and taking the sentence vector after the last update as the sentence vector S_i of the training sentence.
  10. The electronic device according to claim 9, characterized in that the one-hot vector method means representing each word as a multi-dimensional vector whose dimensionality equals the number of words in the vocabulary, in which exactly one dimension has the value 1 and the remaining dimensions are 0; this vector represents the current word.
  11. The electronic device according to claim 8, characterized in that the concatenation step comprises:
    computing, from left to right, the first hidden layer state vector h_i of the current word vector x_i from the hidden layer state vector h_{i-1} of the preceding word vector x_{i-1}, and computing, from right to left, the second hidden layer state vector h_i' of the current word vector x_i from the hidden layer state vector h_{i+1} of the following word vector x_{i+1}.
  12. The electronic device according to claim 8 or 11, characterized in that concatenating the two hidden layer state vectors means concatenating h_i and h_i' of each word in the training sentence with a Concatenate function to obtain the combined hidden layer state vector of that word.
  13. The electronic device according to claim 8, characterized in that the average-vector expression is:
    S = sum(a_i * T_i) / n
    where a_i denotes the weight of a training sentence, T_i denotes the feature vector of each training sentence, and n denotes the number of training sentences.
  14. The electronic device according to claim 13, characterized in that the softmax classification function is:
    σ(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k},  j = 1, …, K
    where K denotes the number of enterprise relationship types, S denotes the average vector whose enterprise relationship type is to be predicted, z_j denotes a certain enterprise relationship type, and σ(z)_j denotes the probability of the enterprise relationship type to be predicted among all the enterprise relationship types.
  15. A computer readable storage medium, characterized in that the computer readable storage medium includes an enterprise relationship extraction program which, when executed by a processor, implements the following steps:
    a sample library establishing step: extracting, from a knowledge base, sentences of enterprise entity pairs that have a relationship as training sample sentences to establish a sample library;
    a word segmentation step: extracting from the sample library all training sentences containing an enterprise entity pair, segmenting each training sentence with a preset word segmentation tool, mapping each segmented word into a word vector x_i, and mapping each training sentence into a sentence vector S_i as the input of the first layer of a recurrent neural network model;
    a concatenation step: in the second layer of the recurrent neural network model, computing, with a long short-term memory module, the first hidden layer state vector h_i of the current word vector x_i from left to right and the second hidden layer state vector h_i' of the current word vector x_i from right to left, concatenating the two hidden layer state vectors to obtain the combined hidden layer state vector of each word in the training sentence, and obtaining the feature vector T_i of each training sentence from the combined hidden layer state vectors of all the words in the training sentence;
    a calculation step: in the third layer of the recurrent neural network model, computing the average vector S over the training sentences from the feature vector T_i of each training sentence with the average-vector expression;
    a weight determination step: in the last layer of the recurrent neural network model, substituting the average vector S and the relationship type of the enterprise entity pair into the softmax classification function to compute the weight a_i of each training sentence;
    a prediction step: extracting sentences containing two enterprise entities from the current text, obtaining the feature vector T_i of each sentence through a bidirectional long short-term memory module, and inputting the feature vector T_i into the trained recurrent neural network model to predict the relationship between the two enterprise entities.
  16. The computer readable storage medium according to claim 15, characterized in that the word segmentation step comprises:
    representing each segmented word as a one-hot vector to obtain an initial word vector; labeling each training sentence with a sentence ID and mapping the sentence ID into an initial sentence vector of the corresponding training sentence; inputting the initial sentence vector, together with the initial word vectors of the left and right neighboring words of a given word in the training sentence, into the continuous bag-of-words model to predict the word vector x_i of that word; updating the sentence vector of the training sentence after each prediction until the word vector x_i of every word in the training sentence has been predicted; and taking the sentence vector after the last update as the sentence vector S_i of the training sentence.
  17. The computer readable storage medium according to claim 16, characterized in that the one-hot vector method means representing each word as a multi-dimensional vector whose dimensionality equals the number of words in the vocabulary, in which exactly one dimension has the value 1 and the remaining dimensions are 0; this vector represents the current word.
  18. The computer readable storage medium according to claim 15, characterized in that the concatenation step comprises:
    computing, from left to right, the first hidden layer state vector h_i of the current word vector x_i from the hidden layer state vector h_{i-1} of the preceding word vector x_{i-1}, and computing, from right to left, the second hidden layer state vector h_i' of the current word vector x_i from the hidden layer state vector h_{i+1} of the following word vector x_{i+1}.
  19. The computer readable storage medium according to claim 15, characterized in that the average-vector expression is:
    S = sum(a_i * T_i) / n
    where a_i denotes the weight of a training sentence, T_i denotes the feature vector of each training sentence, and n denotes the number of training sentences.
  20. The computer readable storage medium according to claim 19, characterized in that the softmax classification function is:
    σ(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k},  j = 1, …, K
    where K denotes the number of enterprise relationship types, S denotes the average vector whose enterprise relationship type is to be predicted, z_j denotes a certain enterprise relationship type, and σ(z)_j denotes the probability of the enterprise relationship type to be predicted among all the enterprise relationship types.
PCT/CN2018/076119 2017-11-02 2018-02-10 Enterprise relationship extraction method, device and storage medium WO2019085328A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711061205.0 2017-11-02
CN201711061205.0A CN107943847B (zh) 2017-11-02 2017-11-02 Enterprise relationship extraction method, device and storage medium

Publications (1)

Publication Number Publication Date
WO2019085328A1 (zh)


Also Published As

Publication number Publication date
CN107943847A (zh) 2018-04-20
CN107943847B (zh) 2019-05-17

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 29/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18873729

Country of ref document: EP

Kind code of ref document: A1