WO2021212749A1 - Named entity labeling method and apparatus, computer device and storage medium - Google Patents

Named entity labeling method and apparatus, computer device and storage medium

Info

Publication number
WO2021212749A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
matrix
textcnn
sentence
model
Prior art date
Application number
PCT/CN2020/118522
Other languages
English (en)
French (fr)
Inventor
陈桢博
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021212749A1 publication Critical patent/WO2021212749A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Definitions

  • This application relates to the field of artificial intelligence technology, in particular to a named entity labeling method, device, computer equipment and storage medium.
  • Named Entity Recognition (NER) is mainly the task of identifying and categorizing the names of persons, places, institutions, and other proper names that appear in text. It is the basis for a variety of natural language processing tasks such as information extraction, information retrieval, and question answering systems.
  • For example, in a resume recognition scenario, it is usually necessary to recognize named entities such as school names and place names in the resume text.
  • The named entity labeling task is a necessary part of named entity recognition; it refers to the process of classifying and labeling each character in the text.
  • The inventor realized that although traditional deep learning methods achieve good results, because the same feature weight is assigned to long-distance features in all sentences during model computation, the recognition accuracy for short-distance key features does not reach the desired level.
  • The main purpose of this application is to provide a named entity labeling method, device, computer equipment and storage medium that overcome the low recognition accuracy of named entity labeling on short-distance key features.
  • To this end, this application provides a named entity labeling method, which includes the following steps:
  • an attention weight matrix between every two characters in the sentence is calculated, and the attention weight matrix is adjusted based on the value vector;
  • fully connected layer processing is performed through the TextCNN model and the result is then input into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • This application also provides a named entity labeling device, including:
  • an acquiring unit, used to acquire the sentences in the resume text and construct the word vectors of the sentences;
  • a first calculation unit, configured to perform a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
  • a second calculation unit, configured to calculate the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
  • a third calculation unit, used to calculate the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and the Gaussian deviation matrix, and to adjust the attention weight matrix based on the value vector;
  • a classification unit, configured to perform fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then input the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • The present application also provides a computer device, including a memory and a processor, where the memory stores a computer program and the processor, when executing the computer program, implements a named entity labeling method including the following steps:
  • an attention weight matrix between every two characters in the sentence is calculated, and the attention weight matrix is adjusted based on the value vector;
  • fully connected layer processing is performed through the TextCNN model and the result is then input into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • The present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, a named entity labeling method is implemented, which includes the following steps:
  • an attention weight matrix between every two characters in the sentence is calculated, and the attention weight matrix is adjusted based on the value vector;
  • fully connected layer processing is performed through the TextCNN model and the result is then input into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • The named entity labeling method, device, computer equipment, and storage medium include: acquiring sentences in resume text, and constructing word vectors of the sentences; performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector; calculating the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • This application introduces weighting by a learnable Gaussian deviation matrix: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, which enhances the ability to capture local context.
  • Fig. 1 is a schematic diagram of the steps of a named entity labeling method in an embodiment of the present application;
  • Fig. 2 is a structural block diagram of a named entity labeling device in an embodiment of the present application;
  • Fig. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
  • An embodiment of the present application provides a named entity labeling method, which includes the following steps:
  • Step S1: acquiring sentences in the resume text, and constructing word vectors of the sentences;
  • Step S2: performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
  • Step S3: calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
  • Step S4: calculating the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
  • Step S5: based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • The above named entity labeling method is applied to scenarios in which named entities such as school names, company names, and professional information are automatically extracted from resume text.
  • the resume text usually includes multiple sentences.
  • Each sentence in the resume text is obtained, and a corresponding word vector is constructed for each sentence. It should be understood that, before constructing the word vector of each sentence, the sentence can also be preprocessed.
  • The preprocessing includes removing characters such as special symbols and stop words, and converting unformatted text into a format that the algorithm can process.
  • After preprocessing, the sentence is input into the embedding layer of a word embedding model so that each character is converted into a corresponding word vector.
  • The word vector dictionary in the above embedding layer is obtained by pre-training with the Word2Vec or GloVe algorithm, and will not be detailed here.
  • The above TextCNN model is an algorithm that uses a convolutional neural network to classify text and can extract the information in the text well. Its forward propagation layer and backward propagation layer are used to learn the preceding and following context respectively, and both are connected to the output layer.
  • The above TextCNN model obtains a word vector matrix by operating on the input word vectors through multi-layer convolutional layers.
  • The convolution kernels of the TextCNN model are one-dimensional, with length set to 2 or 3; the number of convolution channels is set to 128 in this scheme, and the activation function of the convolutional layers is ReLU.
  • Taking a sentence of length m as an example, after embedding layer processing it is converted into an m*300 matrix, and the multi-layer convolution operation of the TextCNN model then outputs an m*128 word vector matrix.
  • The word vector matrix is calculated through the fully connected layer of the TextCNN model to obtain three vectors: the query vector Q, the key vector K, and the value vector V;
  • the above three vectors are all m*n matrices.
  • The query vector Q, key vector K, and value vector V are all obtained by operating on the same word vector matrix through the fully connected layer; the difference lies only in their calculation parameters. The purpose of constructing the query vector Q and key vector K is to calculate the influence weights between characters in the same sentence.
  • The query vector Q and key vector K construct a weight-like matrix, which is used to calculate the weights between the characters in the sentence so as to quantify this influence relationship.
  • The TextCNN model in this embodiment is different from existing models: it introduces calculation parameters for the query vector Q, key vector K, and value vector V in the fully connected layer, as well as calculation parameters for the weight matrix and the Gaussian deviation matrix; during training of the TextCNN model, iterative training yields the optimal calculation parameters for the query vector Q, key vector K, and value vector V, as well as for the weight matrix and the Gaussian deviation matrix.
  • The attention weight matrix between every two characters in the sentence is calculated according to the above query vector, key vector, and Gaussian deviation matrix.
  • Weighting by a learnable Gaussian deviation matrix is introduced: the center position of a local range and a moving window are used to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, which enhances the ability to capture local context.
  • The above attention weight matrix between every two characters means that, for each character, every character in the whole sentence is used to score that character.
  • This score determines how much attention the character pays to the other parts of the sentence.
  • The above attention weight matrix can be obtained by multiplying the above query vector by the key vector and normalizing. More specifically, after multiplying the query vector by the key vector, in order to control the distribution range of the result and avoid extreme values that would make the gradient update too large, the vector obtained from the multiplication is divided by √d and then normalized, which makes the gradient more stable.
  • d is the dimension of the key vector K.
  • The word vector matrix and the adjusted attention weight matrix are first added and processed through the above fully connected layer to obtain a classification matrix; the classification matrix is then input into the above softmax classification layer, and through the classification calculation of the softmax function the probability of each character belonging to each BIOES label is output. The label with the highest probability can then be output directly as the first named entity label corresponding to each character, or a CRF algorithm can be superimposed for label output processing.
  • In the BIOES labeling scheme, B represents the beginning of an entity, I represents the inside of an entity, O represents a non-entity, E represents the end of an entity, and S represents a single-character entity.
  • Different types of named entities also need to be distinguished correspondingly; for example, a certain character may be labeled as the beginning of a person's name, or as the inside of a place-name entity.
  • The TextCNN model in this embodiment is different from existing models: it introduces calculation parameters for the query vector Q, key vector K, and value vector V in the fully connected layer; during training of the above TextCNN model, iterative training yields the optimal calculation parameters for the query vector Q, key vector K, and value vector V.
  • The step S3 of calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector includes:
  • calculating the word vector matrix to obtain the key vector; the purpose of constructing the above query vector Q and key vector K is to calculate the influence weights between the characters in the same sentence;
  • calculating the word vector matrix to obtain the value vector.
  • The above value vector is constructed in order to adjust the above attention weight matrix.
  • After the step S5 of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method includes:
  • Step S6: adding to each character in the sentence the named entity label obtained by classification, to generate first training samples;
  • Step S7: performing sampling with replacement on the first training samples to obtain multiple training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
  • Step S8: inputting the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
  • Step S9: judging whether the named entity labeling results predicted by all the TextCNN sub-models are the same; if they are the same, it is verified that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  • The training sets used by the above TextCNN sub-models are all texts from the resume field; after the above iterative training, the sub-models are more targeted to this professional field.
  • Multiple TextCNN sub-models are trained at the same time, and only when all their results are the same can training be verified as finally complete.
  • Likewise, when multiple TextCNN sub-models are trained at the same time and all their results are the same, this also indicates that the first named entity label of each character in the above sentence is correct.
  • the step S1 of obtaining the sentence in the resume text and constructing the word vector of the sentence includes:
  • Step S11: obtaining a resume document; the resume document may be a Word electronic document or a picture.
  • Step S12: inputting the resume document into a preset text detection model to detect each text area in the resume document, where the text detection model is trained based on a natural scene text detection model. The above text detection model is used to detect the areas of the resume document where text appears; it is only used to detect where the text is located, not to identify what the text in the area specifically is.
  • Step S13: adding a mark box outside each of the text areas; after the mark boxes are added, the corresponding text areas can be easily recognized, which reduces the subsequent recognition workload.
  • Step S14: recognizing each of the mark boxes based on image recognition technology, and performing text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence. After each mark box is recognized, the text recognition model can directly recognize the text content in it, and the content of each mark box is taken as one sentence.
  • Step S15: constructing, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
  • The above word embedding model is trained with the Word2Vec or GloVe algorithm and is used to convert the characters in each sentence into corresponding word vectors.
  • The step S4 of calculating the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, includes:
  • Step S41: calculating the weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters, and calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters, where the calculation parameters used for the weight matrix and the Gaussian deviation matrix are different; it should be understood that these calculation parameters are obtained by iterative training of the TextCNN model.
  • Step S42: adding the weight matrix and the Gaussian deviation matrix, and normalizing the sum to obtain the attention weight matrix;
  • Step S43: multiplying the attention weight matrix by the value vector, so as to adjust the attention weight matrix.
  • The calculation parameters used to compute the weight matrix M and the Gaussian deviation matrix G are different; it should be understood that these parameters are obtained when the above TextCNN model is iteratively trained.
  • The Gaussian deviation matrix G is used to adjust the weight matrix M, introducing the center position of a local range and a moving window to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, thereby enhancing the ability to capture local context. The attention weight matrix ATT is then calculated as ATT(Q, K) = Softmax(M + G).
  • To adjust the attention weight matrix, the attention weight matrix obtained above is multiplied by the above value vector, that is, ATT*V.
  • The constructed attention weight matrix is used as the weights in the calculation with the value vector V. It is understandable that, during training of the above model under the supervised learning task, the optimization algorithm automatically optimizes the parameters according to the results to obtain the optimal calculation parameters; thus, in the concrete prediction process of the model, the optimal matrices Q and K are readily found, so an accurate attention weight matrix can be obtained.
  • In the step of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into the softmax classification layer fused with Gaussian error for classification,
  • the calculation process for obtaining the first named entity label of each character in the sentence is:
  • the Gaussian deviation matrix G is added into the softmax activation function of the softmax classification layer, where G is an L*L matrix, L is the character length of the sentence, G_ij measures the closeness between character x_j and the predicted center position P_i, and D_i is the window size, which is twice the standard deviation of the Gaussian.
  • G_ij is: G_ij = -2(j - P_i)^2 / D_i^2
  • where P_i and D_i are predicted from the query vector,
  • W_p is a trainable linear mapping, and Q_i is the query vector.
  • Weighting by a learnable Gaussian error is introduced: the center position of a local range and a moving window are used to calculate the Gaussian error placed into the softmax function to correct the locally reinforced weight distribution, so that neighborhood relations within a small range are learned while long-distance dependencies are preserved.
  • An embodiment of the present application also provides a named entity labeling device, including:
  • the obtaining unit 10, used to obtain sentences in the resume text and construct word vectors of the sentences;
  • the first calculation unit 20, configured to perform a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of the pre-trained TextCNN model to obtain a word vector matrix;
  • the second calculation unit 30, configured to calculate the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
  • the third calculation unit 40, configured to calculate the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and the Gaussian deviation matrix, and to adjust the attention weight matrix based on the value vector;
  • the classification unit 50, configured to perform fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then input the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • the second calculation unit 30 includes:
  • the first calculation subunit, configured to calculate the word vector matrix based on the query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
  • the second calculation subunit, used to calculate the word vector matrix based on the key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
  • the third calculation subunit, used to calculate the word vector matrix based on the value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
  • the named entity labeling device further includes:
  • a generating unit, configured to add to each character in the sentence the named entity label obtained by classification, to generate first training samples;
  • a training unit, used to perform sampling with replacement on the first training samples to obtain multiple training sample sets, and to train an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
  • an output unit, used to input the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
  • a verification unit, used to judge whether the named entity labeling results predicted by all the TextCNN sub-models are the same; if they are the same, it is verified that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  • the acquiring unit 10 includes:
  • the detection subunit is used to input the resume document into a preset text detection model to detect each text area in the resume document; wherein the text detection model is obtained by training based on a natural scene text detection model;
  • the recognition subunit is used to recognize each of the marked boxes based on image recognition technology, and perform text recognition on the text content in each of the marked boxes through a text recognition model to recognize the text information in each of the marked boxes, And use each of the recognized text information as a sentence;
  • the construction subunit is used to construct the word vector corresponding to each character in each sentence based on the preset word embedding model.
  • the third calculation unit 40 includes:
  • the fourth calculation subunit is configured to calculate a weight matrix based on the corresponding weight matrix calculation parameters according to the query vector and the key vector;
  • a fifth calculation subunit configured to calculate the Gaussian deviation matrix based on the corresponding Gaussian deviation matrix calculation parameters according to the query vector and the key vector;
  • an addition subunit, configured to add the weight matrix and the Gaussian deviation matrix and perform normalization processing to obtain the attention weight matrix;
  • the adjustment subunit is configured to perform multiplication calculation on the attention weight matrix and the value vector to adjust the attention weight matrix.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 3.
  • The computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer equipment is used to store text data, training data, etc.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • When the computer program is executed by the processor, an attention weight matrix between every two characters in the sentence is calculated, and the attention weight matrix is adjusted based on the value vector;
  • fully connected layer processing is performed through the TextCNN model and the result is then input into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • FIG. 3 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, a named entity labeling method is implemented, which includes the following steps:
  • an attention weight matrix between every two characters in the sentence is calculated, and the attention weight matrix is adjusted based on the value vector;
  • fully connected layer processing is performed through the TextCNN model and the result is then input into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • the computer-readable storage medium in this embodiment may be a volatile readable storage medium or a non-volatile readable storage medium.
  • The named entity labeling method, device, computer equipment, and storage medium include: obtaining sentences in the resume text, and constructing word vectors of the sentences; performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector; calculating the attention weight matrix between every two characters in the sentence according to the query vector, the key vector, and a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
  • This application introduces weighting by a learnable Gaussian deviation matrix: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, which enhances the ability to capture local context.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application relates to the field of artificial intelligence technology and provides a named entity labeling method and apparatus, a computer device, and a storage medium, including: constructing word vectors for sentences in resume text; performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a TextCNN model to obtain a word vector matrix; calculating the word vector matrix to obtain a query vector, a key vector, and a value vector; calculating from these the attention weight matrix between every two characters in the sentence, and adjusting it based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence. This application introduces weighting by a Gaussian deviation matrix: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, enhancing the ability to capture local context.

Description

Named entity labeling method and apparatus, computer device and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on April 24, 2020, with application number 202010333674.9 and invention title "Named entity labeling method and apparatus, computer device and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a named entity labeling method and apparatus, a computer device, and a storage medium.
Background
The task of Named Entity Recognition (NER) is mainly to identify and categorize the names of persons, places, institutions, and other proper names appearing in text; it is the basis for a variety of natural language processing tasks such as information extraction, information retrieval, and question answering systems. For example, in a resume recognition scenario, it is usually necessary to recognize named entities such as school names and place names in the resume text.
The named entity labeling task is a necessary part of named entity recognition; it refers to the process of classifying and labeling each character in the text. The inventor realized that although traditional deep learning methods achieve good results, because the same feature weight is assigned to long-distance features in all sentences during model computation, the recognition accuracy for short-distance key features does not reach the desired level.
Technical Problem
The main purpose of this application is to provide a named entity labeling method and apparatus, a computer device, and a storage medium that overcome the defect of low recognition accuracy of named entity labeling on short-distance key features.
Technical Solution
To achieve the above purpose, this application provides a named entity labeling method, including the following steps:
acquiring sentences in resume text, and constructing word vectors of the sentences;
performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
This application also provides a named entity labeling apparatus, including:
an acquisition unit, configured to acquire sentences in resume text and construct word vectors of the sentences;
a first calculation unit, configured to perform a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
a second calculation unit, configured to calculate the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
a third calculation unit, configured to calculate the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and to adjust the attention weight matrix based on the value vector;
a classification unit, configured to perform fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then input the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
This application also provides a computer device, including a memory and a processor, where the memory stores a computer program and the processor, when executing the computer program, implements a named entity labeling method including the following steps:
acquiring sentences in resume text, and constructing word vectors of the sentences;
performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
This application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, a named entity labeling method is implemented, including the following steps:
acquiring sentences in resume text, and constructing word vectors of the sentences;
performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
Beneficial Effects
The named entity labeling method and apparatus, computer device, and storage medium provided by this application include: acquiring sentences in resume text and constructing word vectors of the sentences; performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector; calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence. This application introduces weighting by a learnable Gaussian deviation matrix: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, enhancing the ability to capture local context.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the steps of a named entity labeling method in an embodiment of this application;
Fig. 2 is a structural block diagram of a named entity labeling apparatus in an embodiment of this application;
Fig. 3 is a schematic block diagram of the structure of a computer device in an embodiment of this application.
Best Mode for Carrying Out the Invention
Referring to Fig. 1, an embodiment of this application provides a named entity labeling method, including the following steps:
Step S1: acquiring sentences in resume text, and constructing word vectors of the sentences;
Step S2: performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
Step S3: calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
Step S4: calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
Step S5: based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
In this embodiment, the above named entity labeling method is applied to scenarios in which named entities such as school names, company names, and professional information are automatically extracted from resume text.
As described in step S1, in this embodiment the resume text usually contains multiple sentences; each sentence in the resume text is obtained, and a corresponding word vector is constructed for each sentence. It should be understood that, before the word vector of each sentence is constructed, the sentence can also be preprocessed; the preprocessing includes removing characters such as special symbols and stop words, and converting unformatted text into a format that the algorithm can process. After preprocessing, the sentence is input into the embedding layer of a word embedding model so that each character in the sentence is converted into a corresponding word vector (usually 300-dimensional). The word vector dictionary in the above embedding layer is obtained in advance by training with the Word2Vec or GloVe algorithm, and will not be detailed here.
As described in step S2, the TextCNN model is an algorithm that classifies text with a convolutional neural network and can extract the information in the text well. Its forward propagation layer and backward propagation layer are used to learn the preceding and following context respectively, and both are connected to the output layer. In this embodiment, the TextCNN model operates on the input word vectors through multi-layer convolutional layers to obtain the word vector matrix. The convolution kernels of the TextCNN model are one-dimensional, with length set to 2 or 3; the number of convolution channels is set to 128 in this scheme, and the activation function of the convolutional layers is ReLU. Taking a sentence of length m as an example, after embedding layer processing it is converted into an m*300 matrix, and the multi-layer convolution operation of the TextCNN model then outputs an m*128 word vector matrix.
As described in step S3, after the word vector matrix is obtained in the above step, the fully connected layer of the TextCNN model operates on the word vector matrix to obtain three vectors: the query vector Q, the key vector K, and the value vector V, all of which are m*n matrices. Q, K, and V are all obtained by operating on the same word vector matrix through the fully connected layer; the difference lies only in their calculation parameters. The purpose of constructing the query vector Q and the key vector K is to calculate the influence weights between the characters in the same sentence: when recognizing a named entity, the judgment must refer to characters at other positions in the sentence, so the influence weights of those characters must be considered. The query vector Q and key vector K thus construct a weight-like matrix, used to calculate the weights between the characters in the sentence so as to quantify this influence relationship.
It should be understood that the TextCNN model in this embodiment differs from existing models: it introduces calculation parameters for the query vector Q, the key vector K, and the value vector V in the fully connected layer, as well as calculation parameters for the weight matrix and the Gaussian deviation matrix. During training of the TextCNN model, iterative training yields the optimal calculation parameters for Q, K, and V, as well as for the weight matrix and the Gaussian deviation matrix.
As described in step S4, the attention weight matrix between every two characters in the sentence is calculated according to the query vector and the key vector combined with the Gaussian deviation matrix. In this embodiment, weighting by a learnable Gaussian deviation matrix is introduced: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, enhancing the ability to capture local context.
The attention weight matrix between every two characters means that, for each character, every character in the whole sentence is used to score that character; this score determines how much attention the character pays to the other parts of the sentence. Specifically, the attention weight matrix can be obtained by multiplying the query vector by the key vector and normalizing. More specifically, after the query vector is multiplied by the key vector, in order to control the distribution range of the result and avoid extreme values that would make the gradient update too large, the vector obtained from the multiplication is divided by √d and then normalized, which makes the gradient more stable; here d is the dimension of the key vector K.
As described in step S5, the word vector matrix and the adjusted attention weight matrix are first added and then processed through the above fully connected layer to obtain a classification matrix; the classification matrix is then input into the above softmax classification layer, and through the classification calculation of the softmax function the probability of each character belonging to each BIOES label is output. The label with the highest probability can then be output directly as the first named entity label of each character, or a CRF algorithm can be superimposed for label output processing.
In this embodiment, the BIOES labeling scheme is adopted: B represents the beginning of an entity, I the inside of an entity, O a non-entity, E the end of an entity, and S a single-character entity. Different types of named entities also need to be distinguished correspondingly; for example, a certain character may be labeled as the beginning of a person's name, or as the inside of a place-name entity.
In an embodiment, the TextCNN model in this embodiment differs from existing models in that it introduces the calculation parameters of the query vector Q, the key vector K, and the value vector V in the fully connected layer; during training of the above TextCNN model, iterative training yields the optimal calculation parameters for Q, K, and V.
Therefore, the step S3 of calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector includes:
calculating the word vector matrix based on the query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
calculating the word vector matrix based on the key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector; the purpose of constructing the above query vector Q and key vector K is to calculate the influence weights between the characters in the same sentence;
calculating the word vector matrix based on the value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector. The value vector is constructed in order to adjust the above attention weight matrix.
In an embodiment, after the step S5 of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method includes:
Step S6: adding to each character in the sentence the named entity label obtained by classification, to generate first training samples;
Step S7: performing sampling with replacement on the first training samples to obtain multiple training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
Step S8: inputting the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each TextCNN sub-model;
Step S9: judging whether the named entity labeling results predicted by all the TextCNN sub-models are the same; if they are the same, it is verified that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
It should be understood that the training sets used by the above TextCNN sub-models are all texts from the resume field; after the above iterative training, the sub-models are more targeted to this professional field. Meanwhile, multiple TextCNN sub-models are trained at the same time, and only when all their results are the same can training be verified as finally complete; when all results are the same, this also indicates that the first named entity label of each character in the sentence is correct.
When the TextCNN sub-models are subsequently used for named entity labeling, the same resume text may likewise be input into multiple TextCNN sub-models for prediction, and only when the named entity labeling results predicted by all TextCNN sub-models are the same is that common result taken as the named entity labeling result of the resume text.
In an embodiment, the step S1 of acquiring sentences in resume text and constructing word vectors of the sentences includes:
Step S11: obtaining a resume document; the resume document may be a Word electronic document or a picture.
Step S12: inputting the resume document into a preset text detection model to detect each text area in the resume document, where the text detection model is trained based on a natural scene text detection model. The above text detection model is used to detect the areas of the resume document where text appears; it is only used to detect where the text is located, not to identify what the text in the area specifically is.
Step S13: adding a mark box outside each text area; after the mark boxes are added, the corresponding text areas are easy to recognize, which reduces the subsequent recognition workload.
Step S14: recognizing each mark box based on image recognition technology, and performing text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence. After each mark box is recognized, the text recognition model can directly recognize the text content in it, and the content of each mark box is taken as one sentence.
Step S15: constructing, based on a preset word embedding model, the word vector corresponding to each character in each sentence. The above word embedding model is trained with the Word2Vec or GloVe algorithm and is used to convert the characters in each sentence into corresponding word vectors.
In an embodiment, the step S4 of calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, includes:
Step S41: calculating the weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters, and calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters, where the calculation parameters used for the weight matrix and the Gaussian deviation matrix are different; it should be understood that these calculation parameters are obtained by iterative training of the above TextCNN model.
Step S42: adding the weight matrix and the Gaussian deviation matrix, and normalizing the sum to obtain the attention weight matrix;
Step S43: multiplying the attention weight matrix by the value vector, so as to adjust the attention weight matrix.
The calculation parameters used to compute M and G are different; it should be understood that these calculation parameters are obtained when the above TextCNN model is iteratively trained.
The Gaussian deviation matrix G is used to adjust the weight matrix M, introducing the center position of a local range and a moving window to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, enhancing the ability to capture local context.
Further, the attention weight matrix ATT is calculated from the above weight matrix M and Gaussian deviation matrix G as ATT(Q, K) = Softmax(M + G).
To adjust the attention weight matrix, the obtained attention weight matrix is multiplied by the value vector, i.e., ATT*V. In this embodiment, the constructed attention weight matrix is used as the weights in the calculation with the value vector V. It should be understood that during training, under the supervised learning task, the optimization algorithm automatically optimizes the parameters according to the results to obtain the optimal calculation parameters; thus, in the concrete prediction process of the model, the optimal matrices Q and K are readily found, and an accurate attention weight matrix can be obtained.
In an embodiment, the calculation process of the step of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence is as follows:
The word vector matrix is combined with the adjusted attention weight matrix, giving L1 = C + ATT*V, where C is the word vector matrix; fully connected layer processing then gives L2 = FC(L1); finally, classification through the softmax classification layer yields the probability of the BIOES label to which each character in the sentence belongs, i.e., L3 = softmax(L2). The label with the highest probability is usually taken as the labeling result of the character.
In this embodiment, the Gaussian deviation matrix G is added into the softmax activation function of the softmax classification layer, where G is an L*L matrix, L is the character length of the sentence, G_ij measures the closeness between character x_j and the predicted center position P_i, and D_i is the window size, which is twice the standard deviation of the Gaussian.
The attention weight matrix between every two characters is:
ATT(Q, K) = Softmax(M + G)
G_ij is:
G_ij = -2(j - P_i)^2 / D_i^2
P_i and D_i are calculated as follows. To keep P_i and D_i between 0 and L, a normalization factor L is added. Because each center position depends on the corresponding query vector, a feed-forward mechanism converts that vector into a hidden state, which a linear mapping maps to a scalar:
P_i = L · sigmoid(p_i)
D_i = L · sigmoid(z_i)
p_i = U_p^T · tanh(W_p · Q_i), z_i = U_d^T · tanh(W_p · Q_i)
where U_p, U_d and W_p are trainable linear mappings and Q_i is the query vector.
In this embodiment, weighting by a learnable Gaussian error is introduced: the center position of a local range and a moving window are introduced to calculate the Gaussian error placed into the softmax function to correct the locally reinforced weight distribution. While ensuring that long-distance dependencies are captured, the model also learns the neighborhood relations within a small range, enhancing the ability to capture local context.
Referring to Fig. 2, an embodiment of this application also provides a named entity labeling apparatus, including:
an acquisition unit 10, configured to acquire sentences in resume text and construct word vectors of the sentences;
a first calculation unit 20, configured to perform a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
a second calculation unit 30, configured to calculate the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
a third calculation unit 40, configured to calculate the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and to adjust the attention weight matrix based on the value vector;
a classification unit 50, configured to perform fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then input the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
In an embodiment, the second calculation unit 30 includes:
a first calculation subunit, configured to calculate the word vector matrix based on the query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
a second calculation subunit, configured to calculate the word vector matrix based on the key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
a third calculation subunit, configured to calculate the word vector matrix based on the value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
In an embodiment, the named entity labeling apparatus further includes:
a generation unit, configured to add to each character in the sentence the named entity label obtained by classification, to generate first training samples;
a training unit, configured to perform sampling with replacement on the first training samples to obtain multiple training sample sets, and to train an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
an output unit, configured to input the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each TextCNN sub-model;
a verification unit, configured to judge whether the named entity labeling results predicted by all the TextCNN sub-models are the same, and if they are the same, to verify that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
In an embodiment, the acquisition unit 10 includes:
an acquisition subunit, configured to acquire a resume document;
a detection subunit, configured to input the resume document into a preset text detection model to detect each text area in the resume document, where the text detection model is trained based on a natural scene text detection model;
an adding subunit, configured to add a mark box outside each text area;
a recognition subunit, configured to recognize each mark box based on image recognition technology and to perform text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence;
a construction subunit, configured to construct, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
In an embodiment, the third calculation unit 40 includes:
a fourth calculation subunit, configured to calculate the weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters;
a fifth calculation subunit, configured to calculate the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters;
an addition subunit, configured to add the weight matrix and the Gaussian deviation matrix and normalize the sum to obtain the attention weight matrix;
an adjustment subunit, configured to multiply the attention weight matrix by the value vector so as to adjust the attention weight matrix.
In this embodiment, for the specific implementation of the above units and subunits, please refer to the corresponding parts of the method embodiment above; details are not repeated here.
Referring to Fig. 3, an embodiment of this application also provides a computer device, which may be a server whose internal structure may be as shown in Fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store text data, training data, and the like. The network interface of the computer device is used to communicate with external terminals through a network connection. When executed by the processor, the computer program implements a named entity labeling method, including the following steps:
acquiring sentences in resume text, and constructing word vectors of the sentences;
performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
Those skilled in the art can understand that the structure shown in Fig. 3 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution of this application is applied.
An embodiment of this application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, a named entity labeling method is implemented, including the following steps:
acquiring sentences in resume text, and constructing word vectors of the sentences;
performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence.
It can be understood that the computer-readable storage medium in this embodiment may be a volatile readable storage medium or a non-volatile readable storage medium.
In summary, the named entity labeling method and apparatus, computer device, and storage medium provided in the embodiments of this application include: acquiring sentences in resume text and constructing word vectors of the sentences; performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector; calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification, to obtain the first named entity label of each character in the sentence. This application introduces weighting by a learnable Gaussian deviation matrix: the center position of a local range and a moving window are introduced to calculate a Gaussian deviation that is placed into the softmax function to correct the locally reinforced weight distribution, enhancing the ability to capture local context.
The above are only preferred embodiments of this application and are not intended to limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. A named entity labeling method, comprising the following steps:
    acquiring sentences in resume text, and constructing word vectors of the sentences;
    performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
    calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
    calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
    based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain a first named entity label of each character in the sentence.
  2. The named entity labeling method according to claim 1, wherein the step of calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector comprises:
    calculating the word vector matrix based on query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
    calculating the word vector matrix based on key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
    calculating the word vector matrix based on value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
  3. The named entity labeling method according to claim 1, wherein after the step of, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method comprises:
    adding to each character in the sentence the named entity label obtained by classification, to generate first training samples;
    performing sampling with replacement on the first training samples to obtain multiple training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
    inputting the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
    judging whether the named entity labeling results predicted by all the TextCNN sub-models are the same, and if they are the same, verifying that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  4. The named entity labeling method according to claim 1, wherein the step of acquiring sentences in resume text and constructing word vectors of the sentences comprises:
    acquiring a resume document;
    inputting the resume document into a preset text detection model to detect each text area in the resume document, wherein the text detection model is trained based on a natural scene text detection model;
    adding a mark box outside each of the text areas;
    recognizing each of the mark boxes based on image recognition technology, and performing text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence;
    constructing, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
  5. The named entity labeling method according to claim 1, wherein the step of calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, comprises:
    calculating a weight matrix from the query vector and the key vector based on corresponding weight matrix calculation parameters;
    calculating the Gaussian deviation matrix from the query vector and the key vector based on corresponding Gaussian deviation matrix calculation parameters;
    adding the weight matrix and the Gaussian deviation matrix, and normalizing the sum to obtain the attention weight matrix;
    multiplying the attention weight matrix by the value vector, so as to adjust the attention weight matrix.
  6. A named entity labeling apparatus, comprising:
    an acquisition unit, configured to acquire sentences in resume text and construct word vectors of the sentences;
    a first calculation unit, configured to perform a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
    a second calculation unit, configured to calculate the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
    a third calculation unit, configured to calculate the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and to adjust the attention weight matrix based on the value vector;
    a classification unit, configured to perform fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then input the result into a softmax classification layer fused with Gaussian error for classification, to obtain a first named entity label of each character in the sentence.
  7. The named entity labeling apparatus according to claim 6, wherein the second calculation unit comprises:
    a first calculation subunit, configured to calculate the word vector matrix based on query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
    a second calculation subunit, configured to calculate the word vector matrix based on key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
    a third calculation subunit, configured to calculate the word vector matrix based on value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
  8. The named entity labeling apparatus according to claim 6, further comprising:
    a generation unit, configured to add to each character in the sentence the named entity label obtained by classification, to generate first training samples;
    a training unit, configured to perform sampling with replacement on the first training samples to obtain multiple training sample sets, and to train an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
    an output unit, configured to input the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
    a verification unit, configured to judge whether the named entity labeling results predicted by all the TextCNN sub-models are the same, and if they are the same, to verify that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  9. The named entity labeling apparatus according to claim 6, wherein the acquisition unit comprises:
    an acquisition subunit, configured to acquire a resume document;
    a detection subunit, configured to input the resume document into a preset text detection model to detect each text area in the resume document, wherein the text detection model is trained based on a natural scene text detection model;
    an adding subunit, configured to add a mark box outside each of the text areas;
    a recognition subunit, configured to recognize each of the mark boxes based on image recognition technology and to perform text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence;
    a construction subunit, configured to construct, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
  10. The named entity labeling apparatus according to claim 6, wherein the third calculation unit comprises:
    a fourth calculation subunit, configured to calculate a weight matrix from the query vector and the key vector based on corresponding weight matrix calculation parameters;
    a fifth calculation subunit, configured to calculate the Gaussian deviation matrix from the query vector and the key vector based on corresponding Gaussian deviation matrix calculation parameters;
    an addition subunit, configured to add the weight matrix and the Gaussian deviation matrix and normalize the sum to obtain the attention weight matrix;
    an adjustment subunit, configured to multiply the attention weight matrix by the value vector so as to adjust the attention weight matrix.
  11. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements a named entity labeling method comprising the following steps:
    acquiring sentences in resume text, and constructing word vectors of the sentences;
    performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
    calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
    calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
    based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain a first named entity label of each character in the sentence.
  12. The computer device according to claim 11, wherein the step of calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector comprises:
    calculating the word vector matrix based on query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
    calculating the word vector matrix based on key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
    calculating the word vector matrix based on value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
  13. The computer device according to claim 11, wherein after the step of, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method comprises:
    adding to each character in the sentence the named entity label obtained by classification, to generate first training samples;
    performing sampling with replacement on the first training samples to obtain multiple training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
    inputting the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
    judging whether the named entity labeling results predicted by all the TextCNN sub-models are the same, and if they are the same, verifying that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  14. The computer device according to claim 11, wherein the step of acquiring sentences in resume text and constructing word vectors of the sentences comprises:
    acquiring a resume document;
    inputting the resume document into a preset text detection model to detect each text area in the resume document, wherein the text detection model is trained based on a natural scene text detection model;
    adding a mark box outside each of the text areas;
    recognizing each of the mark boxes based on image recognition technology, and performing text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence;
    constructing, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
  15. The computer device according to claim 11, wherein the step of calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, comprises:
    calculating a weight matrix from the query vector and the key vector based on corresponding weight matrix calculation parameters;
    calculating the Gaussian deviation matrix from the query vector and the key vector based on corresponding Gaussian deviation matrix calculation parameters;
    adding the weight matrix and the Gaussian deviation matrix, and normalizing the sum to obtain the attention weight matrix;
    multiplying the attention weight matrix by the value vector, so as to adjust the attention weight matrix.
  16. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements a named entity labeling method comprising the following steps:
    acquiring sentences in resume text, and constructing word vectors of the sentences;
    performing a multi-layer convolution operation on the word vectors through the multi-layer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
    calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector;
    calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
    based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with Gaussian error for classification, to obtain a first named entity label of each character in the sentence.
  17. The computer-readable storage medium according to claim 16, wherein the step of calculating the word vector matrix based on the fully connected layer of the TextCNN model to obtain a query vector, a key vector, and a value vector comprises:
    calculating the word vector matrix based on query vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the query vector;
    calculating the word vector matrix based on key vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the key vector;
    calculating the word vector matrix based on value vector calculation parameters pre-trained in the fully connected layer of the TextCNN model, to obtain the value vector.
  18. The computer-readable storage medium according to claim 16, wherein after the step of, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into the softmax classification layer fused with Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method comprises:
    adding to each character in the sentence the named entity label obtained by classification, to generate first training samples;
    performing sampling with replacement on the first training samples to obtain multiple training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN sub-models;
    inputting the same unlabeled resume text into all the TextCNN sub-models to output the named entity labeling result predicted by each of the TextCNN sub-models;
    judging whether the named entity labeling results predicted by all the TextCNN sub-models are the same, and if they are the same, verifying that training of the TextCNN sub-models is complete and that the first named entity label of each character in the sentence is correct.
  19. The computer-readable storage medium according to claim 16, wherein the step of acquiring sentences in resume text and constructing word vectors of the sentences comprises:
    acquiring a resume document;
    inputting the resume document into a preset text detection model to detect each text area in the resume document, wherein the text detection model is trained based on a natural scene text detection model;
    adding a mark box outside each of the text areas;
    recognizing each of the mark boxes based on image recognition technology, and performing text recognition on the text content in each mark box through a text recognition model, so as to recognize the text information in each mark box, with each piece of recognized text information treated as one sentence;
    constructing, based on a preset word embedding model, the word vector corresponding to each character in each sentence.
  20. The computer-readable storage medium according to claim 16, wherein the step of calculating the attention weight matrix between every two characters in the sentence according to the query vector and the key vector combined with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, comprises:
    calculating a weight matrix from the query vector and the key vector based on corresponding weight matrix calculation parameters;
    calculating the Gaussian deviation matrix from the query vector and the key vector based on corresponding Gaussian deviation matrix calculation parameters;
    adding the weight matrix and the Gaussian deviation matrix, and normalizing the sum to obtain the attention weight matrix;
    multiplying the attention weight matrix by the value vector, so as to adjust the attention weight matrix.
PCT/CN2020/118522 2020-04-24 2020-09-28 Named entity labeling method and apparatus, computer device and storage medium WO2021212749A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010333674.9 2020-04-24
CN202010333674.9A CN111651992A (zh) 2020-04-24 2020-04-24 Named entity labeling method and apparatus, computer device and storage medium

Publications (1)

Publication Number Publication Date
WO2021212749A1 true WO2021212749A1 (zh) 2021-10-28

Family

ID=72352510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118522 WO2021212749A1 (zh) Named entity labeling method and apparatus, computer device and storage medium 2020-04-24 2020-09-28

Country Status (2)

Country Link
CN (1) CN111651992A (zh)
WO (1) WO2021212749A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048730A (zh) * 2021-11-05 2022-02-15 光大科技有限公司 A word segmentation and entity joint recognition model training method and device
CN114496115A (zh) * 2022-04-18 2022-05-13 北京白星花科技有限公司 Method and system for automatically generating entity relation annotations
CN114580424A (zh) * 2022-04-24 2022-06-03 之江实验室 A labeling method and device for named entity recognition in legal documents
CN115564393A (zh) * 2022-10-24 2023-01-03 深圳今日人才信息科技有限公司 A position recommendation method based on recruitment demand similarity
CN116030014A (zh) * 2023-01-06 2023-04-28 浙江伟众科技有限公司 Intelligent processing method and system for air-conditioning soft and hard pipes
CN116611439A (zh) * 2023-07-19 2023-08-18 北京惠每云科技有限公司 Medical information extraction method and device, electronic equipment and storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651992A (zh) * 2020-04-24 2020-09-11 平安科技（深圳）有限公司 Named entity labeling method and apparatus, computer device and storage medium
CN112580822B (zh) * 2020-12-16 2023-10-17 北京百度网讯科技有限公司 Adversarial training method and device for machine learning models, electronic equipment and medium
CN112580628B (zh) * 2020-12-22 2023-08-01 浙江智慧视频安防创新中心有限公司 Attention-mechanism-based license plate character recognition method and system
CN112784015B (zh) * 2021-01-25 2024-03-12 北京金堤科技有限公司 Information recognition method and apparatus, device, medium, and program
CN113312477A (zh) * 2021-04-19 2021-08-27 上海快确信息科技有限公司 A graph-attention-based semi-structured text classification scheme
CN113051897B (zh) * 2021-05-25 2021-09-10 中国电子科技集团公司第三十研究所 A GPT2 automatic text generation method based on the Performer structure
CN113282707B (zh) * 2021-05-31 2024-01-26 平安国际智慧城市科技股份有限公司 Data prediction method and apparatus based on the Transformer model, server, and storage medium


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190080225A1 (en) * 2017-09-11 2019-03-14 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof
CN110502738A (zh) * 2018-05-18 2019-11-26 阿里巴巴集团控股有限公司 Chinese named entity recognition method, apparatus, device, and query system
CN110222188A (zh) * 2019-06-18 2019-09-10 深圳司南数据服务有限公司 A multi-task learning company announcement processing method and server
CN110298043A (zh) * 2019-07-03 2019-10-01 吉林大学 A vehicle named entity recognition method and system
CN111651992A (zh) * 2020-04-24 2020-09-11 平安科技(深圳)有限公司 Named entity labeling method and apparatus, computer device and storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048730A (zh) * 2021-11-05 2022-02-15 A word segmentation and entity joint recognition model training method and device
CN114496115A (zh) * 2022-04-18 2022-05-13 Method and system for automatically generating entity relation annotations
CN114496115B (zh) * 2022-04-18 2022-08-23 Method and system for automatically generating entity relation annotations
CN114580424A (zh) * 2022-04-24 2022-06-03 A labeling method and device for named entity recognition in legal documents
CN114580424B (zh) * 2022-04-24 2022-08-05 A labeling method and device for named entity recognition in legal documents
CN115564393A (zh) * 2022-10-24 2023-01-03 A position recommendation method based on recruitment demand similarity
CN115564393B (zh) * 2022-10-24 2024-05-10 A position recommendation method based on recruitment demand similarity
CN116030014A (zh) * 2023-01-06 2023-04-28 Intelligent processing method and system for air-conditioning soft and hard pipes
CN116030014B (zh) * 2023-01-06 2024-04-09 Intelligent processing method and system for air-conditioning soft and hard pipes
CN116611439A (zh) * 2023-07-19 2023-08-18 Medical information extraction method and device, electronic equipment and storage medium
CN116611439B (zh) * 2023-07-19 2023-09-19 Medical information extraction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111651992A (zh) 2020-09-11

Similar Documents

Publication Publication Date Title
WO2021212749A1 (zh) Named entity labeling method and apparatus, computer device and storage medium
US11631007B2 Method and device for text-enhanced knowledge graph joint representation learning
CN108733792B (zh) An entity relation extraction method
CN107729309B (zh) Method and device for Chinese semantic analysis based on deep learning
CN110569508A (zh) Sentiment tendency classification method and system fusing part of speech and self-attention mechanism
CN111914558A (zh) Course knowledge relation extraction method and system based on sentence-bag attention distant supervision
CN108287911B (zh) A relation extraction method based on constrained distant supervision
CN111522965A (zh) A question answering method and system for entity relation extraction based on transfer learning
CN113591483A (zh) A document-level event argument extraction method based on sequence labeling
CN113627190A (zh) Visualized data conversion method and apparatus, computer device, and storage medium
CN113742733A (zh) Method and device for reading-comprehension vulnerability event trigger word extraction and vulnerability type recognition
CN114417851A (zh) A sentiment analysis method based on keyword-weighted information
WO2023093525A1 (zh) Model training method, Chinese text error correction method, electronic device, and storage medium
CN114416979A (zh) A text query method, device, and storage medium
CN113051922A (zh) A triple extraction method and system based on deep learning
Rizvi et al. Deep extreme learning machine-based optical character recognition system for nastalique urdu-like script languages
CN114048314A (zh) A natural language steganalysis method
CN113761151A (zh) Synonym mining and question answering method and apparatus, computer device, and storage medium
CN112699685A (zh) Named entity recognition method based on label-guided character-word fusion
CN110929013A (zh) An image question answering implementation method based on bottom-up attention and localization information fusion
Jiang et al. Multilingual interoperation in cross-country industry 4.0 system for one belt and one road
CN114169447B (zh) Event detection method based on a self-attention convolutional bidirectional gated recurrent unit network
CN114417872A (zh) A contract text named entity recognition method and system
CN113989811A (zh) Deep-learning-based method for extracting project companies and suppliers from trade contracts
Wu et al. A text emotion analysis method using the dual-channel convolution neural network in social networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20932272

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20932272

Country of ref document: EP

Kind code of ref document: A1