WO2020140386A1 - TextCNN-based knowledge extraction method, apparatus, computer device and storage medium (基于TextCNN知识抽取方法、装置、计算机设备及存储介质)


Info

Publication number
WO2020140386A1
Authority: WO (WIPO PCT)
Prior art keywords: layer, neural network, convolutional neural, convolution, training data
Application number: PCT/CN2019/089563
Other languages: English (en), French (fr)
Inventors: 金戈, 徐亮, 肖京
Original Assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by 平安科技(深圳)有限公司
Priority to SG11202001276TA
Priority to US 16/635,554 (US11392838B2)
Publication of WO2020140386A1

Classifications

    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2163 Partitioning the feature space
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06F 18/25 Fusion techniques
    • G06F 18/29 Graphical models, e.g. Bayesian networks
    • G06F 40/30 Semantic analysis
    • G06N 20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06N 5/02 Knowledge representation; Symbolic representation
    • G06N 5/022 Knowledge engineering; Knowledge acquisition
    • G06N 5/025 Extracting rules from data
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F 40/295 Named entity recognition

Definitions

  • This application relates to the field of knowledge graphs, and in particular to a method, device, computer equipment, and storage medium for knowledge extraction.
  • Intelligent customer service dialogue not only establishes a quick and effective means of communication between enterprises and users, but also provides the statistical analysis information that enterprises require for refined management.
  • A knowledge graph is an efficient form of information storage and retrieval, and can be applied effectively to customer service robot scenarios.
  • Customer service robots can provide corresponding responses or services according to the chat content, improving the user experience. Knowledge extraction means extracting knowledge from data of different sources and different structures to form structured knowledge that is stored in the knowledge graph.
  • Knowledge extraction is the most critical and most important step in the process of establishing a knowledge graph.
  • However, the existing knowledge extraction process is relatively cumbersome: importing entity information manually consumes substantial resources.
  • the purpose of this application is to provide a TextCNN-based knowledge extraction method, device, computer equipment and storage medium for solving the problems in the prior art.
  • this application provides a knowledge extraction method based on TextCNN, including the following steps:
  • S10 collects first training data and constructs a character vector dictionary and a word vector dictionary;
  • S20 constructs a first convolutional neural network, and trains the first convolutional neural network based on a first optimization algorithm;
  • the first convolutional neural network includes a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
  • S21 collects second training data; the second training data is pre-labeled data including named entity position labels and named entity relationship labels; the second training data is split into individual characters, special symbols are removed, and the result is then input into the first embedding layer;
  • the first multi-layer convolution is used to perform a convolution operation on the matrix output by the first embedding layer; the first multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S24 passes the output of the first multi-layer convolution through the first softmax function to determine, for each character, the prediction probability of the BEMO subdivision categories (B: entity beginning, M: entity middle, E: entity end, O: non-entity);
  • S25 trains the first convolutional neural network: the cross-entropy loss is calculated from the predicted BEMO annotation probabilities and the true BEMO labels of the second training data, and the loss function is minimized through the first optimization algorithm to train the first convolutional neural network;
  • S30 constructs a second convolutional neural network, and trains the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network includes a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
  • S31 performs word segmentation on the second training data:
  • S311 uses the jieba library to perform preliminary word segmentation on the second training data and corrects it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • the second multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S37 trains the second convolutional neural network: the second cross-entropy loss is calculated from the predicted relationship-label probabilities output by the second convolutional neural network and the true relationship labels of the second training data, and the loss function is minimized through the second optimization algorithm to train the second convolutional neural network;
  • S40 inputs the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extracts the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • This application also provides a knowledge extraction device based on TextCNN, including:
  • Character vector dictionary construction module, used to build a character vector dictionary based on the collected first training data;
  • Word vector dictionary construction module, used to build a word vector dictionary based on the collected first training data;
  • a first convolutional neural network construction and training module, for constructing a first convolutional neural network and training the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network includes a sequentially connected first embedding layer, first multi-layer convolution, and first softmax function, and the module includes:
  • a character vector preprocessing module, used to split the second training data into individual characters and remove special symbols before inputting it into the first embedding layer; the second training data is pre-annotated data including named entity position labels and named entity relationship labels;
  • a character vector matrixing unit, used to perform character vector matching on the character-level second training data in the first embedding layer based on the character vector dictionary, so as to convert the second training data into matrix form;
  • a first multi-layer convolution unit, used to perform a convolution operation on the matrix output by the first embedding layer; the first multi-layer convolution unit includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • a first softmax function output unit, configured to pass the output of the first multi-layer convolution through the first softmax function, so as to determine the prediction probability of the BEMO subdivision categories for each character;
  • a first convolutional neural network training unit, used to calculate the cross-entropy loss from the predicted BEMO probabilities and the true BEMO labels of the second training data, and to minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
  • a second convolutional neural network construction and training module, for constructing a second convolutional neural network and training the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network includes a sequentially connected second embedding layer, second multi-layer convolution unit, pooling layer, two fully connected layers, and second softmax function, and the module includes:
  • Word vector preprocessing unit used to segment the second training data, including:
  • a preliminary word segmentation subunit, used to perform preliminary word segmentation on the second training data using the jieba library and to correct it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • Word segmentation preprocessing subunit used to remove the special symbols and non-Chinese characters in the preliminary word segmentation, and then input the processed second training data to the second embedding layer;
  • the word vector matrixing unit is used to perform word vector matching on the second training data after word segmentation in the second embedding layer based on the word vector dictionary to convert the second training data into a matrix form;
  • a second multi-layer convolution unit, used to perform a convolution operation on the matrix output by the second embedding layer; the second multi-layer convolution unit includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • a pooling layer, configured to compress the output of the second multi-layer convolution unit;
  • a fully connected layer unit, used to input the output of the pooling layer into two fully connected layers for cross-channel information fusion;
  • a second softmax function output unit, configured to input the output of the fully connected layers into the second softmax function and determine the corresponding prediction probabilities of the multiple entity relationship labels;
  • a second convolutional neural network training unit, used to calculate the second cross-entropy loss from the predicted relationship-label probabilities output by the second convolutional neural network and the true relationship labels of the second training data, and to minimize the loss function through the second optimization algorithm, so as to train the second convolutional neural network;
  • a knowledge graph triple extraction module, used to input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and to extract the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • the present application also provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor.
  • when the processor executes the computer program, a TextCNN-based knowledge extraction method is implemented, including the following steps:
  • S10 collects first training data and constructs a character vector dictionary and a word vector dictionary;
  • S20 constructs a first convolutional neural network, and trains the first convolutional neural network based on a first optimization algorithm;
  • the first convolutional neural network includes a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
  • S21 collects second training data; the second training data is pre-labeled data including named entity position labels and named entity relationship labels; the second training data is split into individual characters, special symbols are removed, and the result is then input into the first embedding layer;
  • the first multi-layer convolution is used to perform a convolution operation on the matrix output by the first embedding layer; the first multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S24 passes the output of the first multi-layer convolution through the first softmax function to determine, for each character, the prediction probability of the BEMO subdivision categories (B: entity beginning, M: entity middle, E: entity end, O: non-entity);
  • S25 trains the first convolutional neural network: the cross-entropy loss is calculated from the predicted BEMO annotation probabilities and the true BEMO labels of the second training data, and the loss function is minimized through the first optimization algorithm to train the first convolutional neural network;
  • S30 constructs a second convolutional neural network, and trains the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network includes a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
  • S31 performs word segmentation on the second training data:
  • S311 uses the jieba library to perform preliminary word segmentation on the second training data and corrects it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • the second multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S37 trains the second convolutional neural network: the second cross-entropy loss is calculated from the predicted relationship-label probabilities output by the second convolutional neural network and the true relationship labels of the second training data, and the loss function is minimized through the second optimization algorithm to train the second convolutional neural network;
  • S40 inputs the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extracts the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, a knowledge extraction method based on TextCNN is implemented, including the following steps:
  • S10 collects first training data and constructs a character vector dictionary and a word vector dictionary;
  • S20 constructs a first convolutional neural network, and trains the first convolutional neural network based on a first optimization algorithm;
  • the first convolutional neural network includes a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
  • S21 collects second training data; the second training data is pre-labeled data including named entity position labels and named entity relationship labels; the second training data is split into individual characters, special symbols are removed, and the result is then input into the first embedding layer;
  • the first multi-layer convolution is used to perform a convolution operation on the matrix output by the first embedding layer; the first multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S24 passes the output of the first multi-layer convolution through the first softmax function to determine, for each character, the prediction probability of the BEMO subdivision categories (B: entity beginning, M: entity middle, E: entity end, O: non-entity);
  • S25 trains the first convolutional neural network: the cross-entropy loss is calculated from the predicted BEMO annotation probabilities and the true BEMO labels of the second training data, and the loss function is minimized through the first optimization algorithm to train the first convolutional neural network;
  • S30 constructs a second convolutional neural network, and trains the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network includes a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
  • S31 performs word segmentation on the second training data:
  • S311 uses the jieba library to perform preliminary word segmentation on the second training data and corrects it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • the second multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • S37 trains the second convolutional neural network: the second cross-entropy loss is calculated from the predicted relationship-label probabilities output by the second convolutional neural network and the true relationship labels of the second training data, and the loss function is minimized through the second optimization algorithm to train the second convolutional neural network;
  • S40 inputs the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extracts the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • This application provides a TextCNN-based knowledge extraction method, device, computer equipment, and storage medium.
  • Convolutional neural networks are used to realize the knowledge extraction link of the knowledge graph, which effectively improves model training efficiency while ensuring accuracy.
  • the two types of trained convolutional neural network models realize automatic knowledge extraction through prediction fusion.
  • The first convolutional neural network is used to implement named entity recognition. The network used is in the form of full convolution; its input is character vectors, and its output is the entity category boundary prediction. In this way, the original continuous text can be segmented, the text related to named entities can be retained, and the entities can be classified. The second convolutional neural network realizes relationship extraction: the network used includes convolutional layers, pooling layers, and so on; its input includes character vectors and word vectors, and its output is the relationship extraction and recognition result. After this process, the associations between knowledge entities in the text can be determined.
  • FIG. 1 is a flowchart of an embodiment of the TextCNN-based knowledge extraction method of the present application;
  • FIG. 2 is a schematic diagram of the program modules of an embodiment of the TextCNN-based knowledge extraction device of the present application;
  • FIG. 3 is a schematic diagram of the hardware structure of an embodiment of the TextCNN-based knowledge extraction device of the present application.
  • This application provides a TextCNN-based knowledge extraction method, as shown in Figure 1, including the following steps:
  • S10 collects the first training data and constructs a character vector dictionary and a word vector dictionary;
  • step S10 includes:
  • S11 splits the collected first training data into individual characters and removes special symbols and non-Chinese characters, then substitutes the result into the Word2Vec algorithm for training to obtain character vectors and build the character vector dictionary;
  • the collected first training data is segmented into words and special symbols and non-Chinese characters are removed; the result is then substituted into the Word2Vec algorithm for training to obtain the word vector dictionary.
  • Word2Vec algorithm training is implemented through the gensim library in Python.
  • The TextCNN-based knowledge extraction method of this application first obtains the character vector and word vector dictionaries, that is, determines the correspondence between characters/words and vectors. The character vectors and word vectors are constructed separately, and the first training text is the Chinese Wikipedia. For character vectors, the training text is first split into individual characters with special symbols and non-Chinese characters removed, and the processed text is then substituted into the Word2Vec algorithm for training to obtain the character vectors. For word vectors, the training text is first segmented into words with special symbols and non-Chinese characters removed, and the processed text is then substituted into the Word2Vec algorithm for training to obtain the word vectors. Both character vectors and word vectors have a dimension of 300. The word segmentation involved in this step is implemented through the jieba library in Python, and the Word2Vec training is implemented through the gensim library in Python, as sketched below.
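A minimal, illustrative sketch of step S10 (ours, not the applicant's code); the corpus file name "zhwiki.txt" is hypothetical, and apart from the 300-dimensional vectors stated above, all hyperparameters are gensim defaults:

```python
# Sketch of S10: build the character vector and word vector dictionaries.
import re
import jieba
from gensim.models import Word2Vec

def clean(line: str) -> str:
    # Remove special symbols and non-Chinese characters.
    return re.sub(r"[^\u4e00-\u9fff]", "", line)

with open("zhwiki.txt", encoding="utf-8") as f:   # hypothetical Chinese Wikipedia dump
    lines = [clean(l) for l in f]
    lines = [l for l in lines if l]

char_corpus = [list(l) for l in lines]        # split into individual characters
word_corpus = [jieba.lcut(l) for l in lines]  # jieba word segmentation

# vector_size=300 matches the dimension stated in the description.
char_model = Word2Vec(char_corpus, vector_size=300, min_count=1)
word_model = Word2Vec(word_corpus, vector_size=300, min_count=1)

char_dict = {c: char_model.wv[c] for c in char_model.wv.index_to_key}
word_dict = {w: word_model.wv[w] for w in word_model.wv.index_to_key}
```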
  • S20 constructs a first convolutional neural network, and trains the first convolutional neural network based on the first optimization algorithm;
  • the first convolutional neural network includes a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
  • the first convolutional neural network can be established based on the tensorflow library in Python.
  • S21 collects second training data.
  • the second training data is pre-annotated data, including named entity position labels and named entity relationship labels. After the second training data is split into individual characters and special symbols are removed, it is input into the first embedding layer;
  • the first multi-layer convolution is used to perform a convolution operation on the matrix output by the first embedding layer.
  • the first multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • in this embodiment, the first multi-layer convolution includes 5 convolutional layers: the first-type one-dimensional convolutional layer includes three kinds of one-dimensional convolution kernels, each corresponding to 128 channels; the four groups of second-type one-dimensional convolutional layers use one-dimensional convolution kernels of length 3 with 384 channels;
  • S24 passes the output of the first multi-layer convolution through the first softmax function to determine the prediction probability of the BEMO subdivision categories for each character;
  • S25 trains the first convolutional neural network: the cross-entropy loss is calculated from the predicted BEMO probabilities and the true BEMO labels of the second training data, and the loss function is minimized through the first optimization algorithm to train the first convolutional neural network;
  • the first optimization algorithm is the ADAM algorithm or the rmsprop algorithm.
  • The second training text is different from the first training text used for the character vectors and word vectors; it is in the form of short sentences and includes named entity position labels (a BEMO Chinese named entity boundary annotation for each character) and named entity relationship labels.
  • The second training text is split into individual characters and special symbols are removed, and the processed text is input into the first convolutional neural network.
  • The first convolutional neural network performs character vector matching on the character-level second training text in the embedding layer, thereby converting the second training text into matrix form (each row of the matrix corresponds to the vector of one character).
  • The first convolutional neural network then performs the convolution operations. The first multi-layer convolution has 5 convolutional layers in total, and the data of each convolutional layer comes from the output of the previous convolutional layer. The first-type one-dimensional convolutional layer at the first layer includes three kinds of one-dimensional convolution kernels of lengths (1, 3, 5), each corresponding to 128 channels, while the remaining layers use one-dimensional convolution kernels of length 3 with 384 channels. It should be noted that the number of matrix rows must be kept constant during the operation of the convolutional layers.
  • The final convolutional layer of the first multi-layer convolution is output through the first softmax function, and this output corresponds to the BEMO label prediction probability of each character. The BEMO labeling includes multiple subdivision categories, so the first convolutional neural network calculates, for each character, a probability for every subdivision category, such as the probability of "B_place name", the probability of "B_person name", or the probability of "E_person name".
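For illustration only (our example sentence, not one from the application), character-level BEMO annotation with entity-type subdivisions might look like this:

```python
# Hypothetical example: "张三住在北京" ("Zhang San lives in Beijing").
sentence = "张三住在北京"
labels = ["B_person name", "E_person name", "O", "O", "B_place name", "E_place name"]
# The first network outputs, for each character, one probability per subdivision
# class; training compares these probabilities against the true labels above.
```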
  • The algorithm calculates the cross-entropy loss according to the BEMO label prediction probabilities of the first convolutional neural network and the true BEMO labels of the second training text, and minimizes the loss function through the ADAM optimization algorithm to train the neural network.
  • This model may produce contradictory character-level label predictions, so the model only extracts entities whose predicted labels on the preceding and following characters are mutually consistent.
  • The establishment of the first convolutional neural network is implemented through the tensorflow library in Python. At this point, the construction and training of the first convolutional neural network are complete; a minimal sketch of the network follows.
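The following tf.keras sketch reflects our reading of the first network: five 1-D convolutional layers with 'same' padding (keeping the number of rows, i.e. characters, unchanged) and a per-character softmax. The sequence length and number of BEMO classes are assumptions; only the kernel lengths (1, 3, 5), the channel counts (128 and 384), the layer count, and ADAM training come from the description:

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN, EMB_DIM, NUM_BEMO = 128, 300, 9   # assumed sizes; 300 is stated above

inputs = layers.Input(shape=(SEQ_LEN, EMB_DIM))   # one row per character vector
# First-type layer: kernels of length 1, 3 and 5, 128 channels each.
branches = [layers.Conv1D(128, k, padding="same", activation="relu")(inputs)
            for k in (1, 3, 5)]
x = layers.Concatenate()(branches)                # 3 x 128 = 384 channels
# Four second-type layers: kernel length 3, 384 channels, rows unchanged.
for _ in range(4):
    x = layers.Conv1D(384, 3, padding="same", activation="relu")(x)
# Per-character softmax over the BEMO subdivision categories.
outputs = layers.Dense(NUM_BEMO, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```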
  • S30 constructs a second convolutional neural network, and trains the second convolutional neural network based on the second optimization algorithm.
  • the second convolutional neural network includes a sequentially connected second embedding layer, second multi-layer convolution, pooling layer, two fully connected layers, and second softmax function; in this embodiment, the second convolutional neural network can be established based on the tensorflow library in Python.
  • S31 performs word segmentation on the second training data:
  • S311 uses the jieba library to perform preliminary word segmentation on the second training data and corrects it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • S32 performs word vector matching on the second training data after word segmentation at the second embedding layer based on the word vector dictionary to convert the second training data into a matrix form
  • S33 performs a convolution operation on the matrix output by the second embedding layer based on the second multi-layer convolution. The second multi-layer convolution includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels. In this embodiment, the second multi-layer convolution includes 3 convolutional layers: the first-type one-dimensional convolutional layer includes three kinds of one-dimensional convolution kernels, each corresponding to 128 channels; the two groups of second-type one-dimensional convolutional layers use one-dimensional convolution kernels of length 3 with 384 channels.
  • S34 inputs the output of the second multi-layer convolution into the pooling layer for compression;
  • S35 inputs the output of the pooling layer into two fully connected layers for cross-channel information fusion;
  • S36 inputs the output of the fully connected layers into the second softmax function, which determines the corresponding prediction probabilities of the multiple entity relationship labels;
  • S37 trains the second convolutional neural network: the second cross-entropy loss is calculated from the predicted relationship-label probabilities output by the second convolutional neural network and the true relationship labels of the second training data, and the loss function is minimized through the second optimization algorithm to train the second convolutional neural network; in this embodiment, the second optimization algorithm is the ADAM algorithm or the rmsprop algorithm.
  • The training text used to construct the second convolutional neural network is the same as that used for the first convolutional neural network, but the label used is not the BEMO label but the named entity relationship label.
  • The scheme uses the jieba library to perform preliminary word segmentation on the text, corrects the segmentation based on the recognition results of the first convolutional neural network, and finally removes special symbols and non-Chinese characters.
  • The result can then be input into the second convolutional neural network, and word vector matching is performed on the text in the embedding layer (vectors of named entities that do not exist in the word vector dictionary are initialized to 0). A sketch of the segmentation-correction rule follows.
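A sketch of the segmentation-correction rule as we understand it: entity spans predicted by the first network override jieba's preliminary cut. Here `bemo_predict` is a hypothetical wrapper around the trained first network that returns one BEMO label per character:

```python
import jieba

def entity_spans(labels):
    """Extract (start, end) spans from character-level B..E labels."""
    spans, start = [], None
    for i, tag in enumerate(labels):
        if tag.startswith("B"):
            start = i
        elif tag.startswith("E") and start is not None:
            spans.append((start, i + 1))
            start = None
    return spans

def corrected_segmentation(text, bemo_predict):
    spans = entity_spans(bemo_predict(text))
    words, pos = [], 0
    for start, end in spans:
        words += jieba.lcut(text[pos:start])  # non-entity gaps keep jieba's cut
        words.append(text[start:end])         # entity spans: the network prevails
        pos = end
    words += jieba.lcut(text[pos:])
    return words
```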
  • The second multi-layer convolution contains 3 convolutional layers. The first-type one-dimensional convolutional layer at the first layer includes three kinds of one-dimensional convolution kernels of lengths (1, 3, 5), each corresponding to 128 channels, while the remaining one-dimensional convolution kernels all have length 3 and 384 channels. It should be noted that the number of matrix rows must be kept constant during the operation of the convolutional layers.
  • The output of the convolution operation is input into the pooling layer for compression; the pooling form is Max-Pooling, and the output of the pooling layer is input into two fully connected layers to achieve cross-channel information fusion.
  • The fully connected layers output, through the second softmax function, the corresponding predicted probabilities of the multiple labels, such as the probability of "geographic relationship", the probability of "subordinate relationship", and those of other label types.
  • The algorithm calculates the cross-entropy loss based on the model's relationship prediction and the true relationship labels, and minimizes the loss function through the ADAM optimization algorithm to train the neural network.
  • The establishment of the second convolutional neural network is implemented through the tensorflow library in Python. At this point, the construction and training of the second convolutional neural network are complete; a minimal sketch of the network follows.
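A companion tf.keras sketch of the second network, under the same caveats (the sequence length, relationship-label count, and fully connected layer widths are assumptions; the three convolutional layers, Max-Pooling, two fully connected layers, and softmax come from the description):

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN, EMB_DIM, NUM_RELATIONS = 64, 300, 12   # assumed sizes; 300 is stated

inputs = layers.Input(shape=(SEQ_LEN, EMB_DIM))  # one row per word vector
branches = [layers.Conv1D(128, k, padding="same", activation="relu")(inputs)
            for k in (1, 3, 5)]                  # first-type layer
x = layers.Concatenate()(branches)
for _ in range(2):                               # two second-type layers
    x = layers.Conv1D(384, 3, padding="same", activation="relu")(x)
x = layers.GlobalMaxPooling1D()(x)               # Max-Pooling compression
x = layers.Dense(256, activation="relu")(x)      # two fully connected layers
x = layers.Dense(128, activation="relu")(x)      # for cross-channel fusion
outputs = layers.Dense(NUM_RELATIONS, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```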
  • Step S40 inputs the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extracts the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • In step S40, if the entity annotation prediction and the entity relationship prediction contradict each other, the knowledge graph triple extraction for the data to be predicted is abandoned.
  • Finally, the prediction results of the two types of convolutional neural networks are fused, because their predictions may be inconsistent: for example, for a sentence, the first convolutional neural network may predict person entities while the second convolutional neural network predicts a "geographic relationship". The scheme therefore only extracts the knowledge for which the prediction results of the two types of models are mutually consistent; for example, if for a sentence the first convolutional neural network predicts person entities and the second convolutional neural network predicts "subordination", then this knowledge graph triple is extracted. A sketch of this fusion rule follows.
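A sketch of the fusion rule (argmax for the entity annotation, a 0.5 threshold for relationships, and abandonment on contradiction). `compatible` is a hypothetical placeholder for the consistency test, and `entity_spans` comes from the earlier segmentation sketch:

```python
import numpy as np

def extract_triples(text, bemo_probs, rel_probs, bemo_classes, rel_names, compatible):
    # Entity annotation: the class with the highest BEMO probability per character.
    labels = [bemo_classes[i] for i in np.argmax(bemo_probs, axis=-1)]
    entities = [text[s:e] for s, e in entity_spans(labels)]
    # Relationship prediction: classes whose probability exceeds 0.5.
    relations = [rel_names[i] for i, p in enumerate(rel_probs) if p > 0.5]
    triples = []
    for rel in relations:
        # Contradictory predictions are abandoned; consistent ones are extracted.
        if len(entities) >= 2 and compatible(entities, rel):
            triples.append((entities[0], rel, entities[1]))
    return triples
```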
  • this application provides a TextCNN-based knowledge extraction method that uses a convolutional neural network to implement knowledge extraction in the knowledge graph, which effectively improves model training efficiency while ensuring accuracy.
  • By transforming the training text into vector form and feeding it into two types of convolutional neural network models (whose convolutional layers use one-dimensional convolution kernels) to refine the training text information, named entity recognition and entity relationship recognition are realized respectively.
  • the convolutional neural network has parallel computing characteristics, it can make full use of computing resources to improve computing efficiency.
  • the two types of trained convolutional neural network models realize automatic knowledge extraction through prediction fusion.
  • The first convolutional neural network is used to implement named entity recognition. The network used is in the form of full convolution; its input is character vectors, and its output is the entity category boundary prediction. In this way, the original continuous text can be segmented, the text related to named entities can be retained, and the entities can be classified. The second convolutional neural network realizes relationship extraction: the network used includes convolutional layers, pooling layers, and so on; its input includes character vectors and word vectors, and its output is the relationship extraction and recognition result. After this process, the associations between knowledge entities in the text can be determined.
  • This application also shows a TextCNN-based knowledge extraction device 10 which, based on the first embodiment, implements the TextCNN-based knowledge extraction method of the first embodiment; the functions of its program modules are as follows:
  • The TextCNN-based knowledge extraction device 10 may include, or be divided into, one or more program modules; the one or more program modules are stored in a storage medium and executed by one or more processors to complete this application and implement the above TextCNN-based knowledge extraction method.
  • The program module referred to in this application is a series of computer program instruction segments capable of performing specific functions, and is better suited than the program itself to describing the execution of the TextCNN-based knowledge extraction device in the storage medium. The following description introduces the functions of the program modules of this embodiment:
  • This application also provides a TextCNN-based knowledge extraction device 10, including:
  • Character vector dictionary construction module 11, used to build a character vector dictionary based on the collected first training data;
  • Word vector dictionary construction module 12, used to build a word vector dictionary based on the collected first training data;
  • The first convolutional neural network construction and training module 13 is used to construct a first convolutional neural network and train the first convolutional neural network based on the first optimization algorithm; the first convolutional neural network includes a sequentially connected first embedding layer, first multi-layer convolution, and first softmax function, and the module includes:
  • the character vector preprocessing module, used to split the second training data into individual characters and remove special symbols before inputting it into the first embedding layer; the second training data is pre-labeled data, including named entity position labels and named entity relationship labels;
  • the character vector matrixing unit, used to perform character vector matching on the character-level second training data in the first embedding layer based on the character vector dictionary, so as to convert the second training data into matrix form;
  • the first multi-layer convolution unit, used to perform a convolution operation on the matrix output by the first embedding layer; the first multi-layer convolution unit includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • the first softmax function output unit, used to pass the output of the first multi-layer convolution through the first softmax function to determine the prediction probability of the BEMO subdivision categories for each character;
  • the first convolutional neural network training unit, used to calculate the cross-entropy loss from the predicted BEMO probabilities and the true BEMO labels of the second training data, and to minimize the loss function through the first optimization algorithm to train the first convolutional neural network;
  • The second convolutional neural network construction and training module 14 is used to construct a second convolutional neural network and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network includes a sequentially connected second embedding layer, second multi-layer convolution unit, pooling layer, two fully connected layers, and second softmax function, and the module includes:
  • Word vector preprocessing unit used to segment the second training data, including:
  • the preliminary word segmentation subunit, used to perform preliminary word segmentation on the second training data using the jieba library and to correct it according to the predictions of the first convolutional neural network: if the preliminary segmentation result differs from the segmentation predicted by the first convolutional neural network, the predicted segmentation of the first convolutional neural network prevails;
  • Word segmentation preprocessing subunit used to remove the special symbols and non-Chinese characters in the preliminary word segmentation, and then input the processed second training data to the second embedding layer;
  • the word vector matrixing unit is used to perform word vector matching on the second training data after word segmentation in the second embedding layer based on the word vector dictionary to convert the second training data into a matrix form;
  • the second multi-layer convolution unit, used to perform a convolution operation on the matrix output by the second embedding layer; the second multi-layer convolution unit includes a group of first-type one-dimensional convolutional layers at the front and at least one group of second-type one-dimensional convolutional layers at the rear; the first-type one-dimensional convolutional layer includes one-dimensional convolution kernels of different lengths and the same number of channels, and the second-type one-dimensional convolutional layer includes one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolutional layer comes from the output of the previous convolutional layer, and the number of matrix rows is kept unchanged during the convolution operation;
  • the pooling layer, used to compress the output of the second multi-layer convolution unit;
  • the fully connected layers, used to input the output of the pooling layer into two fully connected layers for cross-channel information fusion;
  • the second softmax function output unit, used to input the output of the fully connected layers into the second softmax function to determine the corresponding prediction probabilities of the multiple entity relationship labels;
  • The knowledge graph triple extraction module 15 is used to input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and to extract the knowledge graph triples of the data to be predicted from the entity annotation prediction output by the trained first convolutional neural network and the entity relationship prediction output by the trained second convolutional neural network: the class with the highest probability among the BEMO annotation prediction probabilities is selected as the entity annotation prediction of the first convolutional neural network, and the classes with a prediction probability greater than 0.5 are selected as the entity relationship prediction of the second convolutional neural network, so as to extract the knowledge graph triples of the data to be predicted.
  • In the character vector dictionary construction module 11, the collected first training data is split into individual characters and special symbols and non-Chinese characters are removed; the result is then substituted into the Word2Vec algorithm for training to obtain character vectors and build the character vector dictionary;
  • In the word vector dictionary construction module 12, the collected first training data is segmented into words and special symbols and non-Chinese characters are removed; the result is then substituted into the Word2Vec algorithm for training to obtain the word vector dictionary.
  • Word2Vec algorithm training is implemented through the gensim library in Python.
  • the first convolutional neural network and the second convolutional neural network are established based on the tensorflow library in Python.
  • the first multi-layer convolution includes 5 convolutional layers; the first-type one-dimensional convolutional layer includes three kinds of one-dimensional convolution kernels, each corresponding to 128 channels; the four groups of second-type one-dimensional convolutional layers use one-dimensional convolution kernels of length 3 with 384 channels;
  • the second multi-layer convolution includes 3 convolutional layers; the first-type one-dimensional convolutional layer includes three kinds of one-dimensional convolution kernels, each corresponding to 128 channels; the two groups of second-type one-dimensional convolutional layers use one-dimensional convolution kernels of length 3 with 384 channels.
  • the first optimization algorithm and the second optimization algorithm are the ADAM algorithm or the rmsprop algorithm.
  • In the knowledge graph triple extraction module 15, if the entity annotation prediction and the entity relationship prediction contradict each other, the knowledge graph triple extraction for the data to be predicted is abandoned.
  • This application provides a TextCNN-based knowledge extraction device that implements the knowledge extraction link of the knowledge graph through convolutional neural networks, which effectively improves model training efficiency while ensuring accuracy.
  • By transforming the training text into vector form and feeding it into two types of convolutional neural network models (whose convolutional layers use one-dimensional convolution kernels) to refine the training text information, named entity recognition and entity relationship recognition are realized respectively.
  • the convolutional neural network has parallel computing characteristics, it can make full use of computing resources to improve computing efficiency.
  • the two types of trained convolutional neural network models realize automatic knowledge extraction through prediction fusion.
  • the first convolutional neural network is used to implement named entity recognition.
  • the network used is fully convolutional; its input is character vectors, and its output is entity category boundary predictions.
  • through this process, the originally continuous text can be segmented, the text related to named entities can be retained, and the entities can be classified; the second convolutional neural network realizes knowledge extraction.
  • the network used includes convolution layers, a pooling layer, and so on; its input includes character vectors and word vectors, and its output is relationship extraction and recognition. After this process, the associations among knowledge entities in the text can be determined.
  • This application also provides a computer device, such as a smartphone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server, or cabinet server (including a stand-alone server or a server cluster composed of multiple servers) capable of executing programs.
  • the computer device 20 of this embodiment includes at least, but is not limited to, a memory 21 and a processor 22 that can be communicatively connected to each other through a system bus, as shown in FIG. 3. It should be noted that FIG. 3 only shows the computer device 20 with components 21-22, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
  • the memory 21 (i.e., a readable storage medium) includes flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like.
  • the memory 21 may be an internal storage unit of the computer device 20, such as a hard disk or memory of the computer device 20.
  • the memory 21 may also be an external storage device of the computer device 20, for example a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device 20.
  • the memory 21 may also include both the internal storage unit of the computer device 20 and its external storage device.
  • the memory 21 is generally used to store the operating system and the various kinds of application software installed in the computer device 20, such as the program code of the TextCNN-based knowledge extraction device 10 of Embodiment 1.
  • the memory 21 can also be used to temporarily store various types of data that have been output or are to be output.
  • the processor 22 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 20.
  • the processor 22 is used to run the program code stored in the memory 21 or to process data, for example to run the TextCNN-based knowledge extraction device 10 so as to implement the TextCNN-based knowledge extraction method of Embodiment 1.
  • This application also provides a computer-readable storage medium, such as flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, app stores, and the like, on which a computer program is stored that realizes the corresponding function when executed by a processor.
  • the computer-readable storage medium of this embodiment is used to store the TextCNN-based knowledge extraction device 10, which, when executed by a processor, implements the TextCNN-based knowledge extraction method of Embodiment 1.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Image Analysis (AREA)

Abstract

A TextCNN-based knowledge extraction method, comprising: S10: constructing a character vector dictionary and a word vector dictionary; S20: constructing a first convolutional neural network and training it based on a first optimization algorithm, the first convolutional neural network comprising a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence; S30: constructing a second convolutional neural network and training it based on a second optimization algorithm, the second convolutional neural network comprising a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence; S40: extracting the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network. Because convolutional neural networks compute in parallel, computing resources can be fully utilized to improve computational efficiency.

Description

TextCNN-based knowledge extraction method and device, computer device, and storage medium
Cross-Reference to Related Applications
This application claims priority to the Chinese patent application No. CN 2019100026381, filed on January 2, 2019 and entitled "基于TextCNN知识抽取方法、装置、计算机设备及存储介质" (TextCNN-based knowledge extraction method and device, computer device, and storage medium), the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of knowledge graphs, and in particular to a knowledge extraction method and device, a computer device, and a storage medium.
Background
In recent years, intelligent customer service robots have developed and been applied at home and abroad with rapid growth, and have gradually begun to reach industrial scale in industries such as telecommunications operators and financial services. Intelligent customer service dialogue not only gives enterprises and users a fast and effective means of communication, but also provides enterprises with the statistical analysis information needed for fine-grained management.
A knowledge graph is an efficient form of information storage and retrieval that can be applied effectively in customer service robot scenarios. Through a knowledge graph, a customer service robot can provide appropriate responses or services according to the chat content, improving the user experience. Knowledge extraction means extracting knowledge from data of different sources and structures to form knowledge (structured data) that is stored in the knowledge graph; it is the most critical and most important step in building a knowledge graph. However, the existing process of knowledge extraction is rather cumbersome, and importing entity and relationship information manually consumes substantial resources.
Summary
The purpose of this application is to provide a TextCNN-based knowledge extraction method and device, a computer device, and a storage medium, so as to solve the problems existing in the prior art.
To achieve the above purpose, this application provides a TextCNN-based knowledge extraction method comprising the following steps:
S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels (where B marks the beginning of an entity, M the middle of an entity, E the end of an entity, and O a non-entity) for each character;
S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
S31: Segment the second training data into words:
S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
This application also provides a TextCNN-based knowledge extraction device, comprising:
a character vector dictionary construction module, used to construct a character vector dictionary based on the collected first training data;
a word vector dictionary construction module, used to construct a word vector dictionary based on the collected first training data;
a first convolutional neural network construction and training module, used to construct a first convolutional neural network and train the first convolutional neural network based on a first optimization algorithm, the first convolutional neural network comprising a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence, and comprising:
a character vector preprocessing unit, where the second training data is pre-annotated data containing named entity position labels and named entity relationship labels, used to split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
a character vector matrixization unit, used to perform character vector matching on the character-level second training data at the first embedding layer based on the character vector dictionary, so as to convert the second training data into matrix form;
a first multi-layer convolution unit, used to perform convolution operations on the matrix output by the first embedding layer, the first multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
a first softmax function output unit, used to output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
a first convolutional neural network training unit, used to compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and to minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
a second convolutional neural network construction and training module, used to construct a second convolutional neural network and train the second convolutional neural network based on a second optimization algorithm, the second convolutional neural network comprising a second embedding layer, a second multi-layer convolution unit, a pooling layer, two fully connected layers, and a second softmax function connected in sequence, and comprising:
a word vector preprocessing unit, used to segment the second training data into words, comprising:
a preliminary segmentation subunit, used to perform preliminary word segmentation on the second training data with the jieba library and correct it against the predictions of the first convolutional neural network, where if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
a segmentation preprocessing subunit, used to remove special symbols and non-Chinese characters from the preliminary segmentation and then input the processed second training data into the second embedding layer;
a word vector matrixization unit, used to perform word vector matching on the segmented second training data at the second embedding layer based on the word vector dictionary, so as to convert the second training data into matrix form;
a second multi-layer convolution unit, used to perform convolution operations on the matrix output by the second embedding layer, the second multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
a pooling layer, used to input the output of the second multi-layer convolution unit into the pooling layer for compression;
fully connected layers, used to input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
a second softmax function output unit, used to input the output of the fully connected layers into the second softmax function to determine the prediction probabilities corresponding to the multiple entity relationship labels;
a second convolutional neural network training unit, used to compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and to minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
a knowledge graph triplet extraction module, used to input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and to extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: the class corresponding to the highest probability value among the BEMO label prediction probabilities is selected as the entity annotation prediction output by the first convolutional neural network, and the classes with prediction probability values greater than 0.5 are selected as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
To achieve the above purpose, this application also provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements a TextCNN-based knowledge extraction method comprising the following steps:
S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels (where B marks the beginning of an entity, M the middle of an entity, E the end of an entity, and O a non-entity) for each character;
S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
S31: Segment the second training data into words:
S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
To achieve the above purpose, this application also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements a TextCNN-based knowledge extraction method comprising the following steps:
S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels (where B marks the beginning of an entity, M the middle of an entity, E the end of an entity, and O a non-entity) for each character;
S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
S31: Segment the second training data into words:
S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
The TextCNN-based knowledge extraction method and device, computer device, and storage medium provided by this application implement the knowledge extraction step of knowledge graph construction with convolutional neural networks, effectively improving model training efficiency while preserving accuracy. The training text is converted into vector form and fed into two types of convolutional neural network models (whose convolution layers all use one-dimensional convolution kernels) to distill the information in the training text, realizing named entity recognition and entity relationship recognition respectively. Because convolutional neural networks compute in parallel, computing resources can be fully utilized to improve computational efficiency, and the two trained convolutional neural network models realize automated knowledge extraction through prediction fusion.
Specifically, the data to be predicted is converted into character vector and word vector form respectively and fed into the first convolutional neural network and the second convolutional neural network for processing, where the first convolutional neural network implements named entity recognition. The network used is fully convolutional; its input is character vectors and its output is entity category boundary predictions. Through this process, the originally continuous text can be segmented, the text related to named entities retained, and the entities classified; the second convolutional neural network implements knowledge extraction. The network used includes convolution layers, a pooling layer, and so on; its input includes character vectors and word vectors, and its output is relationship extraction and recognition. Through this process, the associations among knowledge entities in the text can be determined. Combining the entity annotation predictions and entity relationship predictions of the data to be predicted, the entities present and their mutual relationships can be identified and used to extract the knowledge graph triplets of the data to be predicted, thereby realizing automated knowledge extraction while effectively improving model training efficiency without sacrificing accuracy.
Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the TextCNN-based knowledge extraction method of this application;
FIG. 2 is a schematic diagram of the program modules of an embodiment of the TextCNN-based knowledge extraction device of this application;
FIG. 3 is a schematic diagram of the hardware structure of an embodiment of the TextCNN-based knowledge extraction device of this application.
Detailed Description
To make the purpose, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application, not to limit it. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
Embodiment 1
This application provides a TextCNN-based knowledge extraction method, as shown in FIG. 1, comprising the following steps:
S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary.
Preferably, step S10 comprises:
S11: splitting the collected first training data into individual characters, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary;
S12: at the same time, segmenting the collected first training data into words, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining a word vector dictionary. In this embodiment, Word2Vec training is implemented with the gensim library in Python.
The TextCNN-based knowledge extraction method shown in this application first obtains the character and word vector dictionaries, that is, it determines the correspondence between characters/words and vectors. The character vectors and the word vectors are constructed separately, and the first training text is Chinese Wikipedia in both cases. For the character vectors, the training text is first split into individual characters with special symbols and non-Chinese characters removed, and the processed text is then fed into the Word2Vec algorithm for training to obtain the character vectors. For the word vectors, the training text is first segmented into words with special symbols and non-Chinese characters removed, and the processed text is then fed into the Word2Vec algorithm for training to obtain the word vectors. Both the character vectors and the word vectors have dimension 300. The word segmentation in this step is implemented with the jieba library in Python, and the Word2Vec training with the gensim library in Python, as sketched below by way of illustration.
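The following non-limiting sketch illustrates this dictionary-building step. It assumes a plain-text corpus file named corpus.txt (one sentence per line) and the gensim 4.x API (earlier gensim versions use size instead of vector_size); apart from the 300-dimension setting stated above, all file names and hyperparameters are illustrative assumptions rather than part of the described method.

    import re
    import jieba
    from gensim.models import Word2Vec

    def clean(text):
        # Remove special symbols and non-Chinese characters (CJK ideographs kept)
        return re.sub(r"[^\u4e00-\u9fa5]", "", text)

    lines = []
    with open("corpus.txt", encoding="utf-8") as f:
        for raw in f:
            text = clean(raw)
            if text:
                lines.append(text)

    # Character-level corpus: each sentence becomes a list of single characters
    char_corpus = [list(line) for line in lines]
    # Word-level corpus: the same text segmented into words with jieba
    word_corpus = [jieba.lcut(line) for line in lines]

    # Dimension 300 follows the text; window and min_count are assumptions
    char_model = Word2Vec(sentences=char_corpus, vector_size=300, window=5, min_count=1)
    word_model = Word2Vec(sentences=word_corpus, vector_size=300, window=5, min_count=1)

    # The "dictionaries": mappings from characters/words to their vectors
    char_dict = {c: char_model.wv[c] for c in char_model.wv.index_to_key}
    word_dict = {w: word_model.wv[w] for w in word_model.wv.index_to_key}

Training on the full Chinese Wikipedia would proceed in the same way, with the corpus file replaced accordingly.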
S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence. In this embodiment, the first convolutional neural network may be built with the tensorflow library in Python.
S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer.
S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form.
S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations.
In this embodiment, as a preferred solution, the first multi-layer convolution includes 5 convolution layers: the first-type one-dimensional convolution layer includes one-dimensional convolution kernels of three different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each use one-dimensional convolution kernels of length 3 with 384 channels.
S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character.
S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network. The first optimization algorithm is the ADAM algorithm or the rmsprop algorithm.
In step S20, the second training text differs from the first training text used for the character and word vectors: it takes the form of short sentences and contains named entity position labels (a BMEO Chinese named-entity boundary annotation for every character) as well as named entity relationship labels. For preprocessing, this step splits the second training text into individual characters and removes special symbols, and the processed text is input into the first convolutional neural network. At its embedding layer, the first convolutional neural network performs character vector matching on the character-level second training text, converting it into matrix form (each matrix row corresponds to the vector of one character). Once character vector matching is complete, the first convolutional neural network can perform the convolution operations. In this embodiment, the first multi-layer convolution is designed with 5 convolution layers, each taking its data from the output of the preceding layer. The first-type one-dimensional convolution layer at the first level includes one-dimensional convolution kernels of three lengths (1, 3, 5), each corresponding to 128 channels, while the remaining (second-type) convolution layers use one-dimensional convolution kernels of length 3 with 384 channels. Note that the number of matrix rows must be kept unchanged during the convolution operations. The last convolution layer of the first multi-layer convolution produces its output through the first softmax function, and this output corresponds to the BEMO label prediction probabilities of each character. The BEMO annotation includes multiple fine-grained classes, so the first convolutional neural network computes, for a given character, the probability of each fine-grained class, for example the probability of "B_place name", the probability of "B_person name", the probability of "E_person name", and so on. Once the model is built, it can be trained. During training, the algorithm computes the cross-entropy loss function from the BEMO label prediction probabilities of the first convolutional neural network and the true BEMO labels of the second training text, and minimizes the loss function with the ADAM optimization algorithm to train the neural network. Note that this model may produce contradictory character label predictions, so the model only extracts entities whose successive character labels correspond to one another. The first convolutional neural network is built with the tensorflow library in Python. At this point, the solution has completed the construction and training of the first convolutional neural network; a non-limiting sketch of such a network follows.
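The following sketch shows one way such a fully convolutional network could be assembled in tf.keras under the stated design (kernel lengths 1, 3, and 5 with 128 channels each at the first level, four further length-3, 384-channel layers, "same" padding to keep the number of matrix rows unchanged, and a per-character softmax over the BEMO classes). The sequence length, vocabulary size, and number of BEMO classes are illustrative assumptions, not the definitive implementation.

    import tensorflow as tf
    from tensorflow.keras import layers

    SEQ_LEN, VOCAB, EMB_DIM, NUM_BEMO = 100, 5000, 300, 13  # assumed sizes

    char_ids = layers.Input(shape=(SEQ_LEN,), dtype="int32")
    x = layers.Embedding(VOCAB, EMB_DIM)(char_ids)           # first embedding layer

    # First-type layer: kernels of lengths 1, 3, 5 with 128 channels each,
    # concatenated along the channel axis (3 x 128 = 384 channels)
    branches = [layers.Conv1D(128, k, padding="same", activation="relu")(x)
                for k in (1, 3, 5)]
    x = layers.Concatenate()(branches)

    # Four second-type layers: kernel length 3, 384 channels; "same" padding
    # keeps the number of matrix rows unchanged, as the text requires
    for _ in range(4):
        x = layers.Conv1D(384, 3, padding="same", activation="relu")(x)

    # Per-character softmax over the fine-grained BEMO labels
    bemo_probs = layers.Conv1D(NUM_BEMO, 1, padding="same", activation="softmax")(x)

    first_cnn = tf.keras.Model(char_ids, bemo_probs)
    first_cnn.compile(optimizer="adam",                      # ADAM (or rmsprop)
                      loss="sparse_categorical_crossentropy")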
S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence. In this embodiment, the second convolutional neural network may be built with the tensorflow library in Python.
S31: Segment the second training data into words:
S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails.
S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer (a non-limiting sketch of this segmentation-and-correction step follows).
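By way of illustration only, steps S311-S312 might look as follows, assuming the first network's predictions are already available as (start, end) character spans of entities; the rule shown (predicted entity spans overriding the jieba tokens) is one reading of the correction described above.

    import re
    import jieba

    def segment(sentence, entity_spans):
        # entity_spans: (start, end) character indices predicted by the first network
        tokens, i = [], 0
        for start, end in sorted(entity_spans):
            tokens += jieba.lcut(sentence[i:start])  # preliminary jieba segmentation
            tokens.append(sentence[start:end])       # predicted entity kept whole
            i = end
        tokens += jieba.lcut(sentence[i:])
        # S312: remove special symbols and non-Chinese characters
        cleaned = (re.sub(r"[^\u4e00-\u9fa5]", "", t) for t in tokens)
        return [t for t in cleaned if t]

    # e.g. segment("张三出生于北京市", [(0, 2), (5, 8)])
    # could yield ['张三', '出生', '于', '北京市'] (exact jieba output may vary)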
S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form.
S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations. In this embodiment, as a preferred solution, the second multi-layer convolution includes 3 convolution layers: the first-type one-dimensional convolution layer includes one-dimensional convolution kernels of three different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each use one-dimensional convolution kernels of length 3 with 384 channels.
S34: Input the output of the second multi-layer convolution into the pooling layer for compression.
S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels.
S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels.
S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network. In this embodiment, the second optimization algorithm is the ADAM algorithm or the rmsprop algorithm.
In this embodiment, the training text used to construct the second convolutional neural network is the same as that used for the first convolutional neural network, but the annotation used is not the BMEO annotation but the named entity relationship annotation. For data preprocessing, since the first convolutional neural network has annotated the named entity boundaries but not the boundaries of non-entity words, the solution uses the jieba library to perform a preliminary segmentation of the text, corrects it according to the recognition results of the first convolutional neural network, and finally removes special symbols and non-Chinese characters. Once processed, the text can be input into the second convolutional neural network, and word vector matching is performed at the embedding layer (the vectors of named entities absent from the word vector dictionary are initialized to 0). The second multi-layer convolution contains 3 convolution layers at the front. The first-type one-dimensional convolution layer at the first level includes one-dimensional convolution kernels of three lengths (1, 3, 5), each corresponding to 128 channels, while the remaining convolution layers use one-dimensional convolution kernels of length 3 with 384 channels. Note that the number of matrix rows must be kept unchanged during the convolution operations. The convolution output is input into the pooling layer for compression, the pooling form being Max-Pooling, and the pooling layer's output is input into the two fully connected layers to realize information fusion across the channels. The fully connected layers output, through the second softmax function, the prediction probabilities corresponding to multiple labels, for example the probability of a "geographic relationship", the probability of an "affiliation relationship", and other label classes. Once the model is built, it can be trained. During training, the algorithm computes the cross-entropy loss function from the model's relationship predictions and the true relationship labels, and minimizes the loss function with the ADAM optimization algorithm to train the neural network. The second convolutional neural network is built with the tensorflow library in Python; a non-limiting sketch follows. At this point, the solution has completed the construction and training of the second convolutional neural network.
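As with the first network, the following tf.keras sketch is illustrative only: the embedding, the 3-layer convolution (kernel lengths 1, 3, 5 with 128 channels each, then two length-3, 384-channel layers), Max-Pooling, two fully connected layers, and the softmax output follow the text, while the sequence length, vocabulary size, dense-layer widths, and number of relationship labels are assumptions.

    import tensorflow as tf
    from tensorflow.keras import layers

    SEQ_LEN, VOCAB, EMB_DIM, NUM_REL = 50, 20000, 300, 10   # assumed sizes

    word_ids = layers.Input(shape=(SEQ_LEN,), dtype="int32")
    x = layers.Embedding(VOCAB, EMB_DIM)(word_ids)           # second embedding layer

    # First-type layer: kernel lengths 1, 3, 5 with 128 channels each
    branches = [layers.Conv1D(128, k, padding="same", activation="relu")(x)
                for k in (1, 3, 5)]
    x = layers.Concatenate()(branches)

    # Two second-type layers: kernel length 3, 384 channels, rows preserved
    for _ in range(2):
        x = layers.Conv1D(384, 3, padding="same", activation="relu")(x)

    x = layers.GlobalMaxPooling1D()(x)                       # Max-Pooling compression
    x = layers.Dense(256, activation="relu")(x)              # two fully connected
    x = layers.Dense(128, activation="relu")(x)              # layers fuse channels
    rel_probs = layers.Dense(NUM_REL, activation="softmax")(x)

    second_cnn = tf.keras.Model(word_ids, rel_probs)
    second_cnn.compile(optimizer="adam",                     # ADAM (or rmsprop)
                       loss="categorical_crossentropy")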
S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted. Moreover, in step S40, if the entity annotation prediction and the entity relationship prediction contradict each other, extraction of the knowledge graph triplet for the data to be predicted is abandoned.
In this step, the prediction results of the two convolutional neural networks are fused. The predictions of the two networks may contradict each other: for example, for a certain sentence, the first convolutional neural network may predict that it contains person entities while the second convolutional neural network predicts that it expresses a "geographic relationship". The solution therefore only extracts the knowledge for which the predictions of the two models correspond. For example, if for a certain sentence the first convolutional neural network predicts that it contains person entities and the second convolutional neural network predicts an "affiliation relationship", then this knowledge graph triplet is extracted, as sketched below by way of illustration.
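A minimal sketch of this fusion rule, assuming bemo_probs of shape (sequence length, number of BEMO classes) and rel_probs of shape (number of relationship labels,); the label inventories and the compatibility table mapping each relationship to a required entity type are illustrative assumptions.

    import numpy as np

    BEMO = ["B_person", "M_person", "E_person", "B_place", "M_place", "E_place", "O"]
    RELATIONS = ["affiliation", "geographic"]
    REQUIRED = {"affiliation": "person", "geographic": "place"}  # assumed table

    def fuse(bemo_probs, rel_probs):
        # Entity annotation: the highest-probability BEMO class per character
        tags = [BEMO[i] for i in np.argmax(bemo_probs, axis=-1)]
        entity_types = {t.split("_")[1] for t in tags if t != "O"}
        # Entity relationships: every class with prediction probability > 0.5
        rels = [RELATIONS[i] for i, p in enumerate(rel_probs) if p > 0.5]
        # Keep only relationships compatible with a predicted entity type;
        # if none survive, triplet extraction for this sample is abandoned
        kept = [r for r in rels if REQUIRED[r] in entity_types]
        return tags, (kept or None)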
The TextCNN-based knowledge extraction method provided by this application implements the knowledge extraction step of knowledge graph construction with convolutional neural networks, effectively improving model training efficiency while preserving accuracy. The training text is converted into vector form and fed into two types of convolutional neural network models (whose convolution layers all use one-dimensional convolution kernels) to distill the information in the training text, realizing named entity recognition and entity relationship recognition respectively. Because convolutional neural networks compute in parallel, computing resources can be fully utilized to improve computational efficiency, and the two trained convolutional neural network models realize automated knowledge extraction through prediction fusion. Specifically, the data to be predicted is converted into character vector and word vector form respectively and fed into the first convolutional neural network and the second convolutional neural network for processing, where the first convolutional neural network implements named entity recognition. The network used is fully convolutional; its input is character vectors and its output is entity category boundary predictions. Through this process, the originally continuous text can be segmented, the text related to named entities retained, and the entities classified; the second convolutional neural network implements knowledge extraction. The network used includes convolution layers, a pooling layer, and so on; its input includes character vectors and word vectors, and its output is relationship extraction and recognition. Through this process, the associations among knowledge entities in the text can be determined. Combining the entity annotation predictions and entity relationship predictions of the data to be predicted, the entities present and their mutual relationships can be identified and used to extract the knowledge graph triplets of the data to be predicted, thereby realizing automated knowledge extraction while effectively improving model training efficiency without sacrificing accuracy.
Embodiment 2
Referring to FIG. 2, this application shows a TextCNN-based knowledge extraction device 10 which, building on Embodiment 1, is used to implement the TextCNN-based knowledge extraction method of Embodiment 1; the functions of its program modules are described below. In this embodiment, the TextCNN-based knowledge extraction device 10 may comprise, or be divided into, one or more program modules, which are stored in a storage medium and executed by one or more processors, so as to complete this application and implement the above TextCNN-based knowledge extraction method. A program module referred to in this application is a series of computer program instruction segments capable of completing specific functions, and is better suited than the program itself to describing the execution of the TextCNN-based knowledge extraction device in the storage medium. The following description specifically introduces the functions of the program modules of this embodiment:
This application also provides a TextCNN-based knowledge extraction device 10, comprising:
a character vector dictionary construction module 11, used to construct a character vector dictionary based on the collected first training data;
a word vector dictionary construction module 12, used to construct a word vector dictionary based on the collected first training data;
a first convolutional neural network construction and training module 13, used to construct a first convolutional neural network and train the first convolutional neural network based on a first optimization algorithm, the first convolutional neural network comprising a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence, and comprising:
a character vector preprocessing unit, where the second training data is pre-annotated data containing named entity position labels and named entity relationship labels, used to split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
a character vector matrixization unit, used to perform character vector matching on the character-level second training data at the first embedding layer based on the character vector dictionary, so as to convert the second training data into matrix form;
a first multi-layer convolution unit, used to perform convolution operations on the matrix output by the first embedding layer, the first multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
a first softmax function output unit, used to output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
a first convolutional neural network training unit, used to compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and to minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
a second convolutional neural network construction and training module 14, used to construct a second convolutional neural network and train the second convolutional neural network based on a second optimization algorithm, the second convolutional neural network comprising a second embedding layer, a second multi-layer convolution unit, a pooling layer, two fully connected layers, and a second softmax function connected in sequence, and comprising:
a word vector preprocessing unit, used to segment the second training data into words, comprising:
a preliminary segmentation subunit, used to perform preliminary word segmentation on the second training data with the jieba library and correct it against the predictions of the first convolutional neural network, where if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
a segmentation preprocessing subunit, used to remove special symbols and non-Chinese characters from the preliminary segmentation and then input the processed second training data into the second embedding layer;
a word vector matrixization unit, used to perform word vector matching on the segmented second training data at the second embedding layer based on the word vector dictionary, so as to convert the second training data into matrix form;
a second multi-layer convolution unit, used to perform convolution operations on the matrix output by the second embedding layer, the second multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
a pooling layer, used to input the output of the second multi-layer convolution unit into the pooling layer for compression;
fully connected layers, used to input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
a second softmax function output unit, used to input the output of the fully connected layers into the second softmax function to determine the prediction probabilities corresponding to the multiple entity relationship labels;
a second convolutional neural network training unit, used to compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and to minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
a knowledge graph triplet extraction module 15, used to input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and to extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: the class corresponding to the highest probability value among the BEMO label prediction probabilities is selected as the entity annotation prediction output by the first convolutional neural network, and the classes with prediction probability values greater than 0.5 are selected as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
Preferably, in the character vector dictionary construction module 11, the collected first training data is split into individual characters, special symbols and non-Chinese characters are removed, and the result is fed into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary.
Preferably, in the word vector dictionary construction module 12, the collected first training data is segmented into words, special symbols and non-Chinese characters are removed, and the result is fed into the Word2Vec algorithm for training, obtaining a word vector dictionary.
Further, Word2Vec training is implemented with the gensim library in Python.
Preferably, the first convolutional neural network and the second convolutional neural network are built with the tensorflow library in Python.
Preferably, the first multi-layer convolution includes 5 convolution layers: the first-type one-dimensional convolution layer includes one-dimensional convolution kernels of three different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each use one-dimensional convolution kernels of length 3 with 384 channels;
and/or, the second multi-layer convolution includes 3 convolution layers: the first-type one-dimensional convolution layer includes one-dimensional convolution kernels of three different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each use one-dimensional convolution kernels of length 3 with 384 channels.
Preferably, the first optimization algorithm and the second optimization algorithm are the ADAM algorithm or the rmsprop algorithm.
Preferably, in the knowledge graph triplet extraction module 15, if the entity annotation prediction and the entity relationship prediction contradict each other, extraction of the knowledge graph triplet for the data to be predicted is abandoned.
The TextCNN-based knowledge extraction device provided by this application implements the knowledge extraction step of knowledge graph construction with convolutional neural networks, effectively improving model training efficiency while preserving accuracy. The training text is converted into vector form and fed into two types of convolutional neural network models (whose convolution layers all use one-dimensional convolution kernels) to distill the information in the training text, realizing named entity recognition and entity relationship recognition respectively. Because convolutional neural networks compute in parallel, computing resources can be fully utilized to improve computational efficiency, and the two trained convolutional neural network models realize automated knowledge extraction through prediction fusion. Specifically, the data to be predicted is converted into character vector and word vector form respectively and fed into the first convolutional neural network and the second convolutional neural network for processing, where the first convolutional neural network implements named entity recognition. The network used is fully convolutional; its input is character vectors and its output is entity category boundary predictions. Through this process, the originally continuous text can be segmented, the text related to named entities retained, and the entities classified; the second convolutional neural network implements knowledge extraction. The network used includes convolution layers, a pooling layer, and so on; its input includes character vectors and word vectors, and its output is relationship extraction and recognition. Through this process, the associations among knowledge entities in the text can be determined. Combining the entity annotation predictions and entity relationship predictions of the data to be predicted, the entities present and their mutual relationships can be identified and used to extract the knowledge graph triplets of the data to be predicted, thereby realizing automated knowledge extraction while effectively improving model training efficiency without sacrificing accuracy.
Embodiment 3
This application also provides a computer device, such as a smartphone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server, or cabinet server (including a stand-alone server or a server cluster composed of multiple servers) capable of executing programs. The computer device 20 of this embodiment includes at least, but is not limited to, a memory 21 and a processor 22 that can be communicatively connected to each other through a system bus, as shown in FIG. 3. It should be noted that FIG. 3 only shows the computer device 20 with components 21-22, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
In this embodiment, the memory 21 (i.e., a readable storage medium) includes flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 20, such as the hard disk or memory of the computer device 20. In other embodiments, the memory 21 may also be an external storage device of the computer device 20, for example a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device 20. Of course, the memory 21 may also include both the internal storage unit of the computer device 20 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and the various kinds of application software installed in the computer device 20, such as the program code of the TextCNN-based knowledge extraction device 10 of Embodiment 1. In addition, the memory 21 may also be used to temporarily store the various kinds of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 20. In this embodiment, the processor 22 is used to run the program code stored in the memory 21 or to process data, for example to run the TextCNN-based knowledge extraction device 10 so as to implement the TextCNN-based knowledge extraction method of Embodiment 1.
Embodiment 4
This application also provides a computer-readable storage medium, such as flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, app stores, and the like, on which a computer program is stored that realizes the corresponding function when executed by a processor. The computer-readable storage medium of this embodiment is used to store the TextCNN-based knowledge extraction device 10, which, when executed by a processor, implements the TextCNN-based knowledge extraction method of Embodiment 1.
The serial numbers of the above embodiments of this application are for description only and do not represent the superiority or inferiority of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation.
The above are only preferred embodiments of this application and do not thereby limit the patent scope of this application; any equivalent structural or process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. A TextCNN-based knowledge extraction method, characterized in that it comprises the following steps:
    S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
    S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
    S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
    S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
    S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
    S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
    S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
    S31: Segment the second training data into words:
    S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
    S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
    S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
    S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
    S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
    S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
    S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
    S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
  2. The TextCNN-based knowledge extraction method according to claim 1, characterized in that step S10 comprises:
    S11: splitting the collected first training data into individual characters, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary;
    S12: at the same time, segmenting the collected first training data into words, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining a word vector dictionary.
  3. The TextCNN-based knowledge extraction method according to claim 2, characterized in that the Word2Vec training is implemented with the gensim library in Python.
  4. The TextCNN-based knowledge extraction method according to claim 1, characterized in that the first convolutional neural network and the second convolutional neural network are built with the tensorflow library in Python.
  5. The TextCNN-based knowledge extraction method according to claim 1, characterized in that the first multi-layer convolution comprises 5 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels;
    and/or, the second multi-layer convolution comprises 3 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels.
  6. The TextCNN-based knowledge extraction method according to claim 1, characterized in that the first optimization algorithm and the second optimization algorithm are the ADAM algorithm or the rmsprop algorithm.
  7. The TextCNN-based knowledge extraction method according to claim 1, characterized in that in step S40, if the entity annotation prediction and the entity relationship prediction contradict each other, extraction of the knowledge graph triplet for the data to be predicted is abandoned.
  8. A TextCNN-based knowledge extraction device, characterized in that it comprises:
    a character vector dictionary construction module, used to construct a character vector dictionary based on the collected first training data;
    a word vector dictionary construction module, used to construct a word vector dictionary based on the collected first training data;
    a first convolutional neural network construction and training module, used to construct a first convolutional neural network and train the first convolutional neural network based on a first optimization algorithm, the first convolutional neural network comprising a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence, and comprising:
    a character vector preprocessing unit, where the second training data is pre-annotated data containing named entity position labels and named entity relationship labels, used to split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
    a character vector matrixization unit, used to perform character vector matching on the character-level second training data at the first embedding layer based on the character vector dictionary, so as to convert the second training data into matrix form;
    a first multi-layer convolution unit, used to perform convolution operations on the matrix output by the first embedding layer, the first multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
    a first softmax function output unit, used to output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
    a first convolutional neural network training unit, used to compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and to minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
    a second convolutional neural network construction and training module, used to construct a second convolutional neural network and train the second convolutional neural network based on a second optimization algorithm, the second convolutional neural network comprising a second embedding layer, a second multi-layer convolution unit, a pooling layer, two fully connected layers, and a second softmax function connected in sequence, and comprising:
    a word vector preprocessing unit, used to segment the second training data into words, comprising:
    a preliminary segmentation subunit, used to perform preliminary word segmentation on the second training data with the jieba library and correct it against the predictions of the first convolutional neural network, where if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
    a segmentation preprocessing subunit, used to remove special symbols and non-Chinese characters from the preliminary segmentation and then input the processed second training data into the second embedding layer;
    a word vector matrixization unit, used to perform word vector matching on the segmented second training data at the second embedding layer based on the word vector dictionary, so as to convert the second training data into matrix form;
    a second multi-layer convolution unit, used to perform convolution operations on the matrix output by the second embedding layer, the second multi-layer convolution unit comprising one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels, where the data of each convolution layer comes from the output of the preceding convolution layer and the number of matrix rows is kept unchanged during the convolution operations;
    a pooling layer, used to input the output of the second multi-layer convolution unit into the pooling layer for compression;
    fully connected layers, used to input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
    a second softmax function output unit, used to input the output of the fully connected layers into the second softmax function to determine the prediction probabilities corresponding to the multiple entity relationship labels;
    a second convolutional neural network training unit, used to compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and to minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
    a knowledge graph triplet extraction module, used to input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and to extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: the class corresponding to the highest probability value among the BEMO label prediction probabilities is selected as the entity annotation prediction output by the first convolutional neural network, and the classes with prediction probability values greater than 0.5 are selected as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
  9. The TextCNN-based knowledge extraction device according to claim 8, characterized in that in the character vector dictionary construction module, the collected first training data is split into individual characters, special symbols and non-Chinese characters are removed, and the result is fed into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary.
  10. The TextCNN-based knowledge extraction device according to claim 9, characterized in that in the word vector dictionary construction module, the collected first training data is segmented into words, special symbols and non-Chinese characters are removed, and the result is fed into the Word2Vec algorithm for training, obtaining a word vector dictionary.
  11. The TextCNN-based knowledge extraction device according to claim 8, characterized in that the first multi-layer convolution comprises 5 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels;
    and/or, the second multi-layer convolution comprises 3 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels.
  12. [Corrected under Rule 26, 13.08.2019]
    A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the following steps of a TextCNN-based knowledge extraction method:
    S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
    S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
    S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
    S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
    S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
    S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
    S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
    S31: Segment the second training data into words:
    S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
    S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
    S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
    S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
    S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
    S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
    S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
    S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
  13. The computer device according to claim 12, characterized in that step S10 comprises:
    S11: splitting the collected first training data into individual characters, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary;
    S12: at the same time, segmenting the collected first training data into words, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining a word vector dictionary.
  14. The computer device according to claim 13, characterized in that the Word2Vec training is implemented with the gensim library in Python.
  15. The computer device according to claim 12, characterized in that the first convolutional neural network and the second convolutional neural network are built with the tensorflow library in Python.
  16. The computer device according to claim 12, characterized in that the first multi-layer convolution comprises 5 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels;
    and/or, the second multi-layer convolution comprises 3 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels.
  17. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the following steps of a TextCNN-based knowledge extraction method:
    S10: Collect first training data, and construct a character vector dictionary and a word vector dictionary;
    S20: Construct a first convolutional neural network, and train the first convolutional neural network based on a first optimization algorithm; the first convolutional neural network comprises a first embedding layer, a first multi-layer convolution, and a first softmax function connected in sequence;
    S21: Collect second training data, the second training data being pre-annotated data containing named entity position labels and named entity relationship labels; split the second training data into individual characters, remove special symbols, and input the result into the first embedding layer;
    S22: Based on the character vector dictionary, perform character vector matching on the character-level second training data at the first embedding layer, so as to convert the second training data into matrix form;
    S23: The first multi-layer convolution is used to perform convolution operations on the matrix output by the first embedding layer; the first multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S24: Output the first multi-layer convolution through the first softmax function, so as to determine the prediction probabilities of the multiple fine-grained BEMO labels for each character;
    S25: Train the first convolutional neural network: compute the cross-entropy loss function from the BEMO label prediction probabilities and the true BEMO labels of the second training data, and minimize the loss function through the first optimization algorithm, so as to train the first convolutional neural network;
    S30: Construct a second convolutional neural network, and train the second convolutional neural network based on a second optimization algorithm; the second convolutional neural network comprises a second embedding layer, a second multi-layer convolution, a pooling layer, two fully connected layers, and a second softmax function connected in sequence;
    S31: Segment the second training data into words:
    S311: Perform preliminary word segmentation on the second training data with the jieba library, and correct it against the predictions of the first convolutional neural network; if the preliminary segmentation result differs from the segmentation result predicted by the first convolutional neural network, the segmentation result predicted by the first convolutional neural network prevails;
    S312: After removing special symbols and non-Chinese characters from the preliminary segmentation, input the processed second training data into the second embedding layer;
    S32: Based on the word vector dictionary, perform word vector matching on the segmented second training data at the second embedding layer, so as to convert the second training data into matrix form;
    S33: Perform convolution operations on the matrix output by the second embedding layer based on the second multi-layer convolution; the second multi-layer convolution comprises one group of first-type one-dimensional convolution layers at the front and at least one group of second-type one-dimensional convolution layers at the rear, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of different lengths but the same number of channels, and the second-type one-dimensional convolution layer comprising one-dimensional convolution kernels of the same length and the same number of channels; the data of each convolution layer comes from the output of the preceding convolution layer, and the number of matrix rows is kept unchanged during the convolution operations;
    S34: Input the output of the second multi-layer convolution into the pooling layer for compression;
    S35: Input the output of the pooling layer into the two fully connected layers for information fusion across the channels;
    S36: Input the output of the fully connected layers into the second softmax function, so as to determine the prediction probabilities corresponding to the multiple entity relationship labels;
    S37: Train the second convolutional neural network: compute the second cross-entropy loss function from the prediction probabilities of the relationship labels output by the second convolutional neural network and the true relationship labels of the second training data, and minimize the loss function through the optimization algorithm, so as to train the second convolutional neural network;
    S40: Input the data to be predicted into the trained first convolutional neural network and the trained second convolutional neural network, and extract the knowledge graph triplets of the data to be predicted according to the entity annotation predictions output by the trained first convolutional neural network and the entity relationship predictions output by the trained second convolutional neural network: select the class corresponding to the highest probability value among the BEMO label prediction probabilities as the entity annotation prediction output by the first convolutional neural network, and select the classes with prediction probability values greater than 0.5 as the entity relationship predictions output by the second convolutional neural network, so as to extract the knowledge graph triplets of the data to be predicted.
  18. The computer-readable storage medium according to claim 17, characterized in that step S10 comprises:
    S11: splitting the collected first training data into individual characters, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining character vectors and building a character vector dictionary;
    S12: at the same time, segmenting the collected first training data into words, removing special symbols and non-Chinese characters, and feeding the result into the Word2Vec algorithm for training, obtaining a word vector dictionary.
  19. The computer-readable storage medium according to claim 17, characterized in that the first convolutional neural network and the second convolutional neural network are built with the tensorflow library in Python.
  20. The computer-readable storage medium according to claim 17, characterized in that the first multi-layer convolution comprises 5 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the four groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels;
    and/or, the second multi-layer convolution comprises 3 convolution layers, the first-type one-dimensional convolution layer comprising one-dimensional convolution kernels of 3 different lengths, each corresponding to 128 channels, and the two groups of second-type one-dimensional convolution layers each using one-dimensional convolution kernels of length 3 with 384 channels.
PCT/CN2019/089563 2019-01-02 2019-05-31 基于TextCNN知识抽取方法、装置、计算机设备及存储介质 WO2020140386A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
SG11202001276TA SG11202001276TA (en) 2019-01-02 2019-05-31 Method, equipment, computing device and computer-readable storage medium for knowledge extraction based on textcnn
US16/635,554 US11392838B2 (en) 2019-01-02 2019-05-31 Method, equipment, computing device and computer-readable storage medium for knowledge extraction based on TextCNN

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910002638.1 2019-01-02
CN201910002638.1A CN109815339B (zh) 2019-01-02 2019-01-02 基于TextCNN知识抽取方法、装置、计算机设备及存储介质

Publications (1)

Publication Number Publication Date
WO2020140386A1 true WO2020140386A1 (zh) 2020-07-09

Family

ID=66603778

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089563 WO2020140386A1 (zh) 2019-01-02 2019-05-31 基于TextCNN知识抽取方法、装置、计算机设备及存储介质

Country Status (4)

Country Link
US (1) US11392838B2 (zh)
CN (1) CN109815339B (zh)
SG (1) SG11202001276TA (zh)
WO (1) WO2020140386A1 (zh)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832484A (zh) * 2020-07-14 2020-10-27 星际(重庆)智能装备技术研究院有限公司 一种基于卷积感知哈希算法的回环检测方法
CN112084790A (zh) * 2020-09-24 2020-12-15 中国民航大学 一种基于预训练卷积神经网络的关系抽取方法及系统
CN112256873A (zh) * 2020-10-19 2021-01-22 国网浙江杭州市萧山区供电有限公司 一种基于深度学习的变电检修工作任务多标签分类方法
CN112380867A (zh) * 2020-12-04 2021-02-19 腾讯科技(深圳)有限公司 文本处理、知识库的构建方法、装置和存储介质
CN112434790A (zh) * 2020-11-10 2021-03-02 西安理工大学 一种对于卷积神经网络判别部分黑箱问题的自解释方法
CN112426726A (zh) * 2020-12-09 2021-03-02 网易(杭州)网络有限公司 游戏事件抽取方法、装置、存储介质及服务器
CN113065005A (zh) * 2021-05-19 2021-07-02 南京烽火星空通信发展有限公司 一种基于知识图谱和文本分类模型的法律条文推荐方法
CN113157883A (zh) * 2021-04-07 2021-07-23 浙江工贸职业技术学院 一种基于双模型结构的中文意见目标边界预测方法
CN113569773A (zh) * 2021-08-02 2021-10-29 南京信息工程大学 基于知识图谱和Softmax回归的干扰信号识别方法
CN113673434A (zh) * 2021-08-23 2021-11-19 合肥工业大学 一种基于高效卷积神经网络和对比学习的脑电情绪识别方法
CN113780564A (zh) * 2021-09-15 2021-12-10 西北工业大学 融合实体类型信息的知识图谱推理方法、装置、设备及存储介质
CN113806488A (zh) * 2021-09-24 2021-12-17 石家庄铁道大学 一种基于元结构学习的异构图转换的文本挖掘方法
CN114138546A (zh) * 2020-09-03 2022-03-04 中国移动通信集团浙江有限公司 数据备份的方法、装置、计算设备及计算机存储介质
CN114221992A (zh) * 2021-11-12 2022-03-22 国网山西省电力公司电力科学研究院 一种基于跨层指纹的细粒度设备识别方法
CN114511708A (zh) * 2022-01-18 2022-05-17 北京工业大学 基于节点级嵌入特征三维关系重建的图数据相似度方法
CN114694774A (zh) * 2022-02-23 2022-07-01 电子科技大学 一种基于神经网络快速预测多层吸波材料s参数的方法
CN114821169A (zh) * 2022-04-23 2022-07-29 福建福清核电有限公司 微服务架构下的方法级无侵入调用链路追踪方法
CN114817568A (zh) * 2022-04-29 2022-07-29 武汉科技大学 联合注意力机制与卷积神经网络的知识超图链接预测方法
CN115225731A (zh) * 2022-07-29 2022-10-21 中国人民解放军陆军工程大学 一种基于混合神经网络的在线协议识别方法
CN115391414A (zh) * 2022-10-28 2022-11-25 北京双赢天下管理咨询有限公司 一种基于大数据的银行市场拓展系统及方法
CN115757325A (zh) * 2023-01-06 2023-03-07 珠海金智维信息科技有限公司 一种xes日志智能转换方法及系统
CN116095089A (zh) * 2023-04-11 2023-05-09 云南远信科技有限公司 遥感卫星数据处理方法及系统
CN116562760A (zh) * 2023-05-09 2023-08-08 杭州君方科技有限公司 纺织化纤供应链监管方法及其系统
CN116912845A (zh) * 2023-06-16 2023-10-20 广东电网有限责任公司佛山供电局 一种基于nlp与ai的智能内容识别与分析方法及装置

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815339B (zh) * 2019-01-02 2022-02-08 平安科技(深圳)有限公司 基于TextCNN知识抽取方法、装置、计算机设备及存储介质
CN110222693B (zh) * 2019-06-03 2022-03-08 第四范式(北京)技术有限公司 构建字符识别模型与识别字符的方法和装置
CN110442689A (zh) * 2019-06-25 2019-11-12 平安科技(深圳)有限公司 一种问答关系排序方法、装置、计算机设备及存储介质
CN110457677B (zh) * 2019-06-26 2023-11-17 平安科技(深圳)有限公司 实体关系识别方法及装置、存储介质、计算机设备
CN110569500A (zh) * 2019-07-23 2019-12-13 平安国际智慧城市科技股份有限公司 文本语义识别方法、装置、计算机设备和存储介质
CN110516239B (zh) * 2019-08-26 2022-12-09 贵州大学 一种基于卷积神经网络的分段池化关系抽取方法
CN110969015B (zh) * 2019-11-28 2023-05-16 国网上海市电力公司 一种基于运维脚本的标签自动化识别方法和设备
CN111046185B (zh) * 2019-12-16 2023-02-24 重庆邮电大学 一种文本信息的知识图谱关系抽取方法、装置及终端
CN111079442B (zh) * 2019-12-20 2021-05-18 北京百度网讯科技有限公司 文档的向量化表示方法、装置和计算机设备
CN111090749A (zh) * 2019-12-23 2020-05-01 福州大学 一种基于TextCNN的报刊出版物分类方法及系统
CN111405585B (zh) * 2020-03-19 2023-10-03 北京联合大学 一种基于卷积神经网络的邻区关系预测方法
CN111611794A (zh) * 2020-05-18 2020-09-01 众能联合数字技术有限公司 一种基于行业规则和TextCNN模型的通用工程信息提取的方法
CN111951792B (zh) * 2020-07-30 2022-12-16 北京先声智能科技有限公司 一种基于分组卷积神经网络的标点标注模型
CN112235264B (zh) * 2020-09-28 2022-10-14 国家计算机网络与信息安全管理中心 一种基于深度迁移学习的网络流量识别方法及装置
CN114548102A (zh) * 2020-11-25 2022-05-27 株式会社理光 实体文本的序列标注方法、装置及计算机可读存储介质
CN112633927B (zh) * 2020-12-23 2021-11-19 浙江大学 一种基于知识图谱规则嵌入的组合商品挖掘方法
US11625880B2 (en) * 2021-02-09 2023-04-11 Electronic Arts Inc. Machine-learning models for tagging video frames
CN113077118A (zh) * 2021-03-01 2021-07-06 广东电网有限责任公司广州供电局 一种基于互联网智能推送技术的工单推送方法
CN113673336B (zh) * 2021-07-16 2023-09-26 华南理工大学 基于对齐ctc的字符切割方法、系统及介质
CN113822061B (zh) * 2021-08-13 2023-09-08 国网上海市电力公司 一种基于特征图构建的小样本专利分类方法
CN114111764B (zh) * 2021-08-21 2024-01-12 西北工业大学 一种导航知识图谱构建及推理应用方法
CN113807519A (zh) * 2021-08-30 2021-12-17 华中师范大学 一种融入教学反馈与习得理解的知识图谱构建方法
CN113836940B (zh) * 2021-09-26 2024-04-12 南方电网数字电网研究院股份有限公司 电力计量领域的知识融合方法、装置和计算机设备
CN113947161A (zh) * 2021-10-28 2022-01-18 广东工业大学 一种基于注意力机制的多标签文本分类方法及系统
CN114064926A (zh) * 2021-11-24 2022-02-18 国家电网有限公司大数据中心 多模态电力知识图谱构建方法、装置、设备及存储介质
CN114448821A (zh) * 2021-12-03 2022-05-06 航天科工网络信息发展有限公司 一种智能路由方法、装置及网络设备
CN114238524B (zh) * 2021-12-21 2022-05-31 军事科学院系统工程研究院网络信息研究所 基于增强样本模型的卫星频轨数据信息抽取方法
CN114511007B (zh) * 2022-01-17 2022-12-09 上海梦象智能科技有限公司 一种基于多尺度特征感知的非侵入式电气指纹识别方法
CN114330323B (zh) * 2022-03-08 2022-06-28 成都数联云算科技有限公司 实体关系联合抽取方法、装置、计算机终端及存储介质
CN114580424B (zh) * 2022-04-24 2022-08-05 之江实验室 一种用于法律文书的命名实体识别的标注方法和装置
CN114897007B (zh) * 2022-04-26 2024-04-19 太原理工大学 一种复合信息分层卷积神经网络的钻机健康状况评估方法
CN115017945A (zh) * 2022-05-24 2022-09-06 南京林业大学 基于增强型卷积神经网络的机械故障诊断方法和诊断系统
CN115081439B (zh) * 2022-07-01 2024-02-27 淮阴工学院 一种基于多特征自适应增强的化学药品分类方法及系统
CN115994668B (zh) * 2023-02-16 2023-06-20 浙江非线数联科技股份有限公司 智慧社区资源管理系统
CN116907214B (zh) * 2023-05-09 2024-03-08 广东夏和瓷业有限公司 环保日用陶瓷的制备工艺及其系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016165082A1 (zh) * 2015-04-15 2016-10-20 中国科学院自动化研究所 基于深度学习的图像隐写检测方法
CN108563779A (zh) * 2018-04-25 2018-09-21 北京计算机技术及应用研究所 一种基于神经网络的无模板自然语言文本答案生成方法
CN109815339A (zh) * 2019-01-02 2019-05-28 平安科技(深圳)有限公司 基于TextCNN知识抽取方法、装置、计算机设备及存储介质

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005203551A (ja) 2004-01-15 2005-07-28 Suncall Corp 巻線装置
CN202534501U (zh) 2012-03-08 2012-11-14 上海东普电器制造有限公司 新能源大容量变压器感应线圈多层箔绕系统
CN205282279U (zh) 2015-12-25 2016-06-01 旭源电子(珠海)有限公司 变压器半自动包铜箔机
US10817509B2 (en) * 2017-03-16 2020-10-27 Massachusetts Institute Of Technology System and method for semantic mapping of natural language input to database entries via convolutional neural networks
CN107031946A (zh) 2017-06-23 2017-08-11 珠海林顺机电有限公司 铜箔贴胶带装置
CN108108351B (zh) * 2017-12-05 2020-05-22 华南理工大学 一种基于深度学习组合模型的文本情感分类方法
CN207818360U (zh) 2018-01-19 2018-09-04 深圳市海目星激光智能装备股份有限公司 一种变压器的包铜箔设备
CN108182177A (zh) * 2018-01-24 2018-06-19 谢德刚 一种数学试题知识点自动化标注方法和装置
CN108509520B (zh) * 2018-03-09 2021-10-29 中山大学 基于词性和多重cnn的多通道文本分类模型的构建方法
CN108614875B (zh) * 2018-04-26 2022-06-07 北京邮电大学 基于全局平均池化卷积神经网络的中文情感倾向性分类方法
AU2018101513A4 (en) * 2018-10-11 2018-11-15 Hui, Bo Mr Comprehensive Stock Prediction GRU Model: Emotional Index and Volatility Based

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016165082A1 (zh) * 2015-04-15 2016-10-20 中国科学院自动化研究所 基于深度学习的图像隐写检测方法
CN108563779A (zh) * 2018-04-25 2018-09-21 北京计算机技术及应用研究所 一种基于神经网络的无模板自然语言文本答案生成方法
CN109815339A (zh) * 2019-01-02 2019-05-28 平安科技(深圳)有限公司 基于TextCNN知识抽取方法、装置、计算机设备及存储介质

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JW_S: "Convolutional Neural Networks for Sentence Classification,", 13 August 2018 (2018-08-13), pages 1 - 4, XP009521863, Retrieved from the Internet <URL:https://www.cnblogs.com/jws-2018/p/9465605.html> *
YOON KIM: "Convolutional Neural Networks for Sentence Classification", PROCEEDINGS OF THE 2014 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 25 October 2014 (2014-10-25), pages 1746 - 1751, XP055274108, DOI: 10.3115/v1/D14-1181 *

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832484A (zh) * 2020-07-14 2020-10-27 星际(重庆)智能装备技术研究院有限公司 一种基于卷积感知哈希算法的回环检测方法
CN111832484B (zh) * 2020-07-14 2023-10-27 星际(重庆)智能装备技术研究院有限公司 一种基于卷积感知哈希算法的回环检测方法
CN114138546A (zh) * 2020-09-03 2022-03-04 中国移动通信集团浙江有限公司 数据备份的方法、装置、计算设备及计算机存储介质
CN114138546B (zh) * 2020-09-03 2024-05-10 中国移动通信集团浙江有限公司 数据备份的方法、装置、计算设备及计算机存储介质
CN112084790A (zh) * 2020-09-24 2020-12-15 中国民航大学 一种基于预训练卷积神经网络的关系抽取方法及系统
CN112084790B (zh) * 2020-09-24 2022-07-05 中国民航大学 一种基于预训练卷积神经网络的关系抽取方法及系统
CN112256873A (zh) * 2020-10-19 2021-01-22 国网浙江杭州市萧山区供电有限公司 一种基于深度学习的变电检修工作任务多标签分类方法
CN112256873B (zh) * 2020-10-19 2023-10-24 国网浙江杭州市萧山区供电有限公司 一种基于深度学习的变电检修工作任务多标签分类方法
CN112434790A (zh) * 2020-11-10 2021-03-02 西安理工大学 一种对于卷积神经网络判别部分黑箱问题的自解释方法
CN112434790B (zh) * 2020-11-10 2024-03-29 西安理工大学 一种对于卷积神经网络判别部分黑箱问题的自解释方法
CN112380867A (zh) * 2020-12-04 2021-02-19 腾讯科技(深圳)有限公司 文本处理、知识库的构建方法、装置和存储介质
CN112426726A (zh) * 2020-12-09 2021-03-02 网易(杭州)网络有限公司 游戏事件抽取方法、装置、存储介质及服务器
CN113157883A (zh) * 2021-04-07 2021-07-23 浙江工贸职业技术学院 一种基于双模型结构的中文意见目标边界预测方法
CN113065005A (zh) * 2021-05-19 2021-07-02 南京烽火星空通信发展有限公司 一种基于知识图谱和文本分类模型的法律条文推荐方法
CN113065005B (zh) * 2021-05-19 2024-01-09 南京烽火星空通信发展有限公司 一种基于知识图谱和文本分类模型的法律条文推荐方法
CN113569773A (zh) * 2021-08-02 2021-10-29 南京信息工程大学 基于知识图谱和Softmax回归的干扰信号识别方法
CN113569773B (zh) * 2021-08-02 2023-09-15 南京信息工程大学 基于知识图谱和Softmax回归的干扰信号识别方法
CN113673434A (zh) * 2021-08-23 2021-11-19 合肥工业大学 一种基于高效卷积神经网络和对比学习的脑电情绪识别方法
CN113673434B (zh) * 2021-08-23 2024-02-20 合肥工业大学 一种基于高效卷积神经网络和对比学习的脑电情绪识别方法
CN113780564A (zh) * 2021-09-15 2021-12-10 西北工业大学 融合实体类型信息的知识图谱推理方法、装置、设备及存储介质
CN113780564B (zh) * 2021-09-15 2024-01-12 西北工业大学 融合实体类型信息的知识图谱推理方法、装置、设备及存储介质
CN113806488A (zh) * 2021-09-24 2021-12-17 石家庄铁道大学 一种基于元结构学习的异构图转换的文本挖掘方法
CN113806488B (zh) * 2021-09-24 2024-02-02 石家庄铁道大学 一种基于元结构学习的异构图转换的文本挖掘方法
CN114221992A (zh) * 2021-11-12 2022-03-22 国网山西省电力公司电力科学研究院 一种基于跨层指纹的细粒度设备识别方法
CN114511708A (zh) * 2022-01-18 2022-05-17 北京工业大学 基于节点级嵌入特征三维关系重建的图数据相似度方法
CN114694774A (zh) * 2022-02-23 2022-07-01 电子科技大学 一种基于神经网络快速预测多层吸波材料s参数的方法
CN114821169A (zh) * 2022-04-23 2022-07-29 福建福清核电有限公司 微服务架构下的方法级无侵入调用链路追踪方法
CN114817568A (zh) * 2022-04-29 2022-07-29 武汉科技大学 联合注意力机制与卷积神经网络的知识超图链接预测方法
CN114817568B (zh) * 2022-04-29 2024-05-10 武汉科技大学 联合注意力机制与卷积神经网络的知识超图链接预测方法
CN115225731A (zh) * 2022-07-29 2022-10-21 中国人民解放军陆军工程大学 一种基于混合神经网络的在线协议识别方法
CN115225731B (zh) * 2022-07-29 2024-03-05 中国人民解放军陆军工程大学 一种基于混合神经网络的在线协议识别方法
CN115391414A (zh) * 2022-10-28 2022-11-25 北京双赢天下管理咨询有限公司 一种基于大数据的银行市场拓展系统及方法
CN115391414B (zh) * 2022-10-28 2023-01-13 北京双赢天下管理咨询有限公司 一种基于大数据的银行市场拓展系统及方法
CN115757325B (zh) * 2023-01-06 2023-04-18 珠海金智维信息科技有限公司 一种xes日志智能转换方法及系统
CN115757325A (zh) * 2023-01-06 2023-03-07 珠海金智维信息科技有限公司 一种xes日志智能转换方法及系统
CN116095089A (zh) * 2023-04-11 2023-05-09 云南远信科技有限公司 遥感卫星数据处理方法及系统
CN116562760A (zh) * 2023-05-09 2023-08-08 杭州君方科技有限公司 纺织化纤供应链监管方法及其系统
CN116562760B (zh) * 2023-05-09 2024-04-26 杭州君方科技有限公司 纺织化纤供应链监管方法及其系统
CN116912845B (zh) * 2023-06-16 2024-03-19 广东电网有限责任公司佛山供电局 一种基于nlp与ai的智能内容识别与分析方法及装置
CN116912845A (zh) * 2023-06-16 2023-10-20 广东电网有限责任公司佛山供电局 一种基于nlp与ai的智能内容识别与分析方法及装置

Also Published As

Publication number Publication date
US20210216880A1 (en) 2021-07-15
US11392838B2 (en) 2022-07-19
CN109815339A (zh) 2019-05-28
CN109815339B (zh) 2022-02-08
SG11202001276TA (en) 2020-08-28

Similar Documents

Publication Publication Date Title
WO2020140386A1 (zh) 基于TextCNN知识抽取方法、装置、计算机设备及存储介质
CN110162593B (zh) 一种搜索结果处理、相似度模型训练方法及装置
CN112199375B (zh) 跨模态的数据处理方法、装置、存储介质以及电子装置
WO2020224219A1 (zh) 中文分词方法、装置、电子设备及可读存储介质
WO2020258487A1 (zh) 一种问答关系排序方法、装置、计算机设备及存储介质
WO2021174774A1 (zh) 神经网络关系抽取方法、计算机设备及可读存储介质
WO2021012519A1 (zh) 基于人工智能的问答方法、装置、计算机设备及存储介质
WO2022048363A1 (zh) 网站分类方法、装置、计算机设备及存储介质
CN110598203A (zh) 一种结合词典的军事想定文书实体信息抽取方法及装置
CN111783767B (zh) 文字识别方法、装置、电子设备及存储介质
CN112883193A (zh) 一种文本分类模型的训练方法、装置、设备以及可读介质
CN113051356A (zh) 开放关系抽取方法、装置、电子设备及存储介质
CN110348012B (zh) 确定目标字符的方法、装置、存储介质及电子装置
CN110929119A (zh) 数据标注方法、装置、设备及计算机存储介质
CN116303537A (zh) 数据查询方法及装置、电子设备、存储介质
JP7309811B2 (ja) データ注釈方法、装置、電子機器および記憶媒体
CN116821373A (zh) 基于图谱的prompt推荐方法、装置、设备及介质
CN111625567A (zh) 数据模型匹配方法、装置、计算机系统及可读存储介质
CN113254649B (zh) 敏感内容识别模型的训练方法、文本识别方法及相关装置
CN114357195A (zh) 基于知识图谱的问答对生成方法、装置、设备及介质
CN111191242A (zh) 漏洞信息确定方法、装置、计算机可读存储介质及设备
CN114580354B (zh) 基于同义词的信息编码方法、装置、设备和存储介质
CN115909376A (zh) 文本识别方法、文本识别模型训练方法、装置及存储介质
CN115640376A (zh) 文本标注方法、装置、电子设备和计算机可读存储介质
CN115186738A (zh) 模型训练方法、装置和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19906684

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19906684

Country of ref document: EP

Kind code of ref document: A1