WO2021072852A1 - Sequence labeling method and system, and computer device - Google Patents

Sequence labeling method and system, and computer device

Info

Publication number
WO2021072852A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
word
target text
vector
labeling
Prior art date
Application number
PCT/CN2019/117403
Other languages
French (fr)
Chinese (zh)
Inventor
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021072852A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • the embodiments of the present application relate to the field of sequence labeling, and in particular, to a sequence labeling method, system, computer equipment, and non-volatile computer-readable storage medium.
  • among all natural language processing applications, named entity recognition is the most basic and most widely used. It identifies entities with specific meanings in text, including names of persons, places, and organizations, and proper nouns.
  • named entity recognition is an important foundational tool for downstream applications such as information extraction, question answering, syntactic analysis, machine translation, and semantic-web-oriented metadata annotation.
  • by applying named entity recognition, a natural language model can be constructed that understands, analyzes, and answers natural language much as humans do.
  • however, existing models often fail to take long-range contextual information into account, which limits recognition accuracy.
  • an embodiment of the present application provides a sequence labeling method, the steps of which include:
  • receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
  • inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
  • inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
  • taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
  • generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
  • an embodiment of the present application also provides a sequence labeling system, including:
  • a text receiving module, configured to receive a target text sequence and convert it into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
  • a first labeling module, configured to input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and to output, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
  • a second labeling module, configured to input the first labeling sequence into a fully connected layer and to output a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
  • an output label module, configured to take the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
  • an output entity module, configured to generate a named entity sequence according to the label sequence and to output the named entity sequence.
  • an embodiment of the present application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the computer-readable instructions, when executed by the processor, implement the following steps:
  • receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
  • inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
  • inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
  • taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
  • generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
  • the embodiments of the present application also provide a non-volatile computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor, causing the at least one processor to execute the following steps:
  • receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
  • inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
  • inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
  • taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
  • generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
  • the sequence labeling method, system, computer device, and non-volatile computer-readable storage medium provided by the embodiments of the application offer an effective way to label text sequences. They solve the prior-art problem that models cannot take long-range contextual information into account, which limits recognition accuracy, so that named entities can be extracted by feeding the raw sentence directly into the model, with strong adaptability, wide applicability, and improved accuracy of sequence labeling for entity recognition.
  • FIG. 1 is a schematic flowchart of a sequence labeling method according to an embodiment of the application.
  • FIG. 2 is a schematic diagram of the program modules of the second embodiment of the sequence labeling system of this application.
  • FIG. 3 is a schematic diagram of the hardware structure of the third embodiment of the computer equipment of this application.
  • the computer device 2 will be used as an execution subject for exemplary description.
  • FIG. 1 shows a flowchart of the steps of a sequence labeling method according to an embodiment of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order of execution of the steps.
  • the following is an exemplary description with the computer device 2 as the execution subject. The details are as follows.
  • Step S100: Receive a target text sequence, and convert the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word.
  • the step S100 may further include:
  • Step S100a: Input the target text sequence into an embedding layer, which outputs a plurality of word vectors corresponding to the target text sequence; the plurality of word vectors includes at least one punctuation vector.
  • Step S100b: Input the plurality of word vectors into a segmentation layer, and divide them according to the at least one punctuation vector to obtain n word vector sets corresponding to n segmentation codes.
  • for example, the target text sequence [Curie was born in Poland, lives in the United States] is divided into sentence A [Curie was born in Poland] and sentence B [lives in the United States]; segmentation code A is added to the first half-sentence and segmentation code B to the second.
  • Step S100c: Perform an encoding operation on each segmentation code by position encoding, and determine the position information encoding of each segmentation code to obtain the position vector of each word in the target text sequence.
  • the position information encoding may be used to determine the position of each word in the target text sequence.
  • Step S100d: Generate the sentence vector of the target text sequence according to the word vector and the position vector of each word in the target text sequence.
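  • The following is a minimal Python sketch of step S100, assuming a toy character vocabulary, random character embeddings, and sinusoidal position encodings; the names (char_embed, position_encoding) and the mean-pooled sentence vector are illustrative assumptions, not taken from the patent.

        import numpy as np

        d = 8                                    # toy embedding size
        text = "居里生于波兰，居住在美国"          # target text sequence
        vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
        rng = np.random.default_rng(0)
        char_embed = rng.normal(size=(len(vocab), d))     # word (character) vectors

        # Step S100b: split on the punctuation into segments A and B (segmentation codes).
        chars, seg_ids, seg = [], [], 0
        for ch in text:
            if ch == "，":
                seg += 1
                continue
            chars.append(ch)
            seg_ids.append(seg)

        # Step S100c: sinusoidal position encoding, one common choice of position vector.
        def position_encoding(pos, d):
            enc = np.array([pos / 10000 ** (2 * (i // 2) / d) for i in range(d)])
            enc[0::2], enc[1::2] = np.sin(enc[0::2]), np.cos(enc[1::2])
            return enc

        word_vecs = [char_embed[vocab[ch]] for ch in chars]
        pos_vecs = [position_encoding(p, d) for p in range(len(chars))]

        # Step S100d: a simple sentence vector, here the mean of word + position vectors.
        sentence_vec = np.mean([w + p for w, p in zip(word_vecs, pos_vecs)], axis=0)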
  • Step S102: Input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into the trained BERT model, which outputs the first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags.
  • the n first tags may be a set of position tags and semantic tags, or a set of position tags and part-of-speech tags.
  • BERT is an existing pre-trained model. Its full name is Bidirectional Encoder Representations from Transformers, i.e., a bidirectional Transformer encoder; the Transformer is an architecture that relies entirely on self-attention to compute representations of its input and output.
  • BERT aims to pre-train deep bidirectional representations by jointly conditioning on context in all layers. The pre-trained BERT can therefore be fine-tuned with an additional output layer to build state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
  • the BERT model can be obtained by capturing words through a masked language model (MLM) method and representing the sentence level through a "next sentence prediction" method. The masked language model randomly masks some of the input tokens, and the goal is to predict the original vocabulary id of each masked word based only on its context. Unlike left-to-right language-model pre-training, this training objective lets the representation fuse context from both the left and the right, so that a deep bidirectional Transformer can be pre-trained.
  • next sentence prediction means that when pre-training the language model, two sentences are selected in one of two ways: either two sentences that genuinely follow each other in the corpus, or a first sentence paired with a second sentence drawn at random from the corpus. In addition to the masked language model task, the model is asked to predict whether the second sentence really follows the first.
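  • As a concrete illustration, the following sketch builds MLM and next-sentence-prediction training examples in the two ways described above; the 50/50 sentence sampling follows the text, while the 15% mask rate is a common convention assumed here, not stated in the patent.

        import random

        def make_nsp_pair(sentences):
            """Return (A, B, is_next): B either truly follows A or is drawn at random."""
            i = random.randrange(len(sentences) - 1)
            a = sentences[i]
            if random.random() < 0.5:
                return a, sentences[i + 1], True       # genuinely consecutive sentences
            return a, random.choice(sentences), False  # randomly drawn second sentence

        def mask_tokens(tokens, mask_token="[MASK]", rate=0.15):
            """Randomly mask tokens; the model must predict the original ids."""
            masked, targets = [], []
            for t in tokens:
                if random.random() < rate:
                    masked.append(mask_token)
                    targets.append(t)        # prediction target for this position
                else:
                    masked.append(t)
                    targets.append(None)     # not predicted
            return masked, targets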
  • training the pre-trained BERT model may include: acquiring multiple training text sequences, using them as the training set of the BERT model, and feeding the training set into the pre-trained BERT model to train it and obtain a trained BERT model.
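  • One possible realization of this training step, assuming the Hugging Face transformers library and the bert-base-chinese checkpoint (neither is named in the patent; both are assumptions), is sketched below; the tag count n_tags is illustrative.

        import torch
        from transformers import BertTokenizerFast, BertForTokenClassification

        n_tags = 9   # e.g. BIOES tags for person/place entities plus O (assumed)
        tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
        model = BertForTokenClassification.from_pretrained(
            "bert-base-chinese", num_labels=n_tags)
        optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

        def train_step(text, tag_ids):
            # tag_ids must align with the tokenized sequence; positions such as
            # [CLS]/[SEP] are usually given the ignore index -100.
            enc = tokenizer(text, return_tensors="pt")
            labels = torch.tensor([tag_ids])
            loss = model(**enc, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            return loss.item()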
  • the step S102 may further include:
  • Step S102a: Perform feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag for each word in the target text sequence.
  • Step S102b: Generate the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
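  • Continuing the fine-tuning sketch above, the first labeling sequence can then be read off at inference time as one softmax-normalized n-dimensional probability vector per character (a sketch, reusing the tokenizer and model defined earlier):

        import torch

        @torch.no_grad()
        def first_annotation_sequence(text):
            model.eval()
            enc = tokenizer(text, return_tensors="pt")
            logits = model(**enc).logits[0]           # (seq_len, n_tags)
            probs = torch.softmax(logits, dim=-1)     # first probabilities
            return probs[1:-1]                        # drop the [CLS] and [SEP] rows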
  • Step S104: Input the first labeling sequence into the fully connected layer, which outputs the second labeling sequence, where the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags.
  • the n second tags may be a set of position tags and semantic tags, or a set of position tags and part-of-speech tags.
  • the step S104 may further include:
  • Step S104a: Input the first labeling sequence into the neural network structure of the fully connected layer and perform additional feature extraction to obtain the second probability of each second tag for each word in the target text sequence.
  • for the i-th word of the target text sequence, the additional feature extraction is computed as B_i = wX_i + b, where X_i is the vector of first probabilities of the first tags of the i-th word in the first labeling sequence, and w and b are model learning parameters;
  • the neural network structure of the fully connected layer of this embodiment may be a multi-layer transformer structure.
  • the multi-layer transformer structure further includes an attention mechanism: after the first labeling sequence is processed by the attention mechanism, it is fed into the feedforward fully connected neural network structure for additional feature extraction, yielding the second probability of each second tag for each word in the target text sequence; that is, the second probabilities are obtained through the operation wx + b, where x is the sequence and w and b are model learning parameters.
  • Step S104b: Generate the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
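  • A minimal sketch of this fully connected layer, applying B_i = wX_i + b to each first n-dimensional vector; sharing one weight matrix across positions and re-normalizing with a softmax are implementation assumptions.

        import torch
        import torch.nn as nn

        n_tags = 9
        fc = nn.Linear(n_tags, n_tags)                # learnable w and b

        def second_annotation_sequence(first_seq):
            # first_seq: tensor of shape (seq_len, n_tags) of first probabilities X_i
            scores = fc(first_seq)                    # B_i = w X_i + b per position
            return torch.softmax(scores, dim=-1)      # second probabilities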
  • Step S106: Take the second labeling sequence as the input sequence of the conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model.
  • the step S106 may further include:
  • Step S106a: Input the second labeling sequence into the CRF model;
  • Step S106b: Perform Viterbi decoding on the second labeling sequence through the Viterbi algorithm to obtain the optimal solution path, where the optimal solution path is the label sequence with the highest probability over the entire target text sequence;
  • this step determines the output that the target text sequence should correspond to, according to the second probability of each second tag of each word in the target text sequence; it is implemented with the Viterbi algorithm, which does not output the highest-probability tag independently for each word, but rather the highest-probability label sequence for the entire target text sequence.
  • the Viterbi algorithm may rely on the following property: if the highest-probability path passes through a certain point of the lattice (fence network), then the sub-path from the start to that point must itself be the highest-probability path from the start to that point; when there are k states at time i, there are k shortest paths from the start to those k states, and the final shortest path must pass through one of them.
  • Step S106c: Generate the label sequence according to the optimal solution path.
  • the highest-probability label sequence for the entire target text sequence is computed with the Viterbi algorithm; when computing the shortest path to the (i+1)-th state, only the shortest paths from the start to the current k state values and the steps from those state values to the (i+1)-th state value need to be considered.
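  • A self-contained Viterbi decoder over per-word tag scores plus tag-to-tag transition scores is sketched below; in the pipeline above the transition scores would be learned by the CRF layer, and scores are assumed to be in log space so that they add.

        import numpy as np

        def viterbi(emissions, transitions):
            """emissions: (m, n) per-word tag scores; transitions: (n, n) tag-to-tag scores."""
            m, n = emissions.shape
            score = emissions[0].copy()          # best score of paths ending in each tag
            back = np.zeros((m, n), dtype=int)   # best previous tag at each step
            for t in range(1, m):
                # best path to tag j at time t goes through some tag i at time t-1
                total = score[:, None] + transitions + emissions[t][None, :]
                back[t] = total.argmax(axis=0)
                score = total.max(axis=0)
            path = [int(score.argmax())]         # backtrace from the best final tag
            for t in range(m - 1, 0, -1):
                path.append(int(back[t][path[-1]]))
            return path[::-1]                    # highest-probability tag sequence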
  • Step S108: Generate a named entity sequence according to the label sequence, and output the named entity sequence.
  • a named entity sequence can be generated from the label sequence; the named entity sequence is the labeling system's prediction for the target text sequence.
  • the named entities include place names, person names, etc. Sequence labeling adopts the BIOES scheme, where B marks the beginning of an entity, I the middle of an entity, O a non-entity, E the end of an entity, and S a single-word entity; each named entity label also corresponds to an entity category, refined into forms such as B-place name: the beginning of a place-name entity.
  • the sentence "Curie was born in Warsaw" as an example, this sentence will be split into a sequence of words.
  • the home character is marked as B name
  • the inner character is marked as E name
  • the new character is marked as O
  • the word Yu is marked as O
  • the wave character is marked as B place name
  • the blue character is marked as E- place name.
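  • A short sketch of step S108, turning a BIOES label sequence into named entities, using the example above; the label strings are illustrative.

        def decode_bioes(chars, labels):
            entities, start = [], None
            for i, (ch, lab) in enumerate(zip(chars, labels)):
                if lab.startswith("B-"):
                    start = i                                    # entity begins
                elif lab.startswith("E-") and start is not None:
                    entities.append(("".join(chars[start:i + 1]), lab[2:]))
                    start = None                                 # entity ends
                elif lab.startswith("S-"):
                    entities.append((ch, lab[2:]))               # single-word entity
            return entities

        chars = list("居里生于波兰")
        labels = ["B-person", "E-person", "O", "O", "B-place", "E-place"]
        print(decode_bioes(chars, labels))   # [('居里', 'person'), ('波兰', 'place')]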
  • Fig. 2 is a schematic diagram of program modules of the second embodiment of the sequence labeling system of this application.
  • the sequence labeling system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and implement the sequence labeling method described above.
  • the program modules referred to in the embodiments of the present application are series of computer-readable instruction segments capable of completing specific functions. The following description introduces the functions of each program module in this embodiment:
  • the text receiving module 200 is configured to receive a target text sequence and convert it into a corresponding sentence vector, a word vector of each word, and a position vector of each word.
  • the text receiving module 200 is further configured to: input the target text sequence into an embedding layer, which outputs a plurality of word vectors corresponding to the target text sequence, including at least one punctuation vector; input the plurality of word vectors into a segmentation layer, and divide them according to the at least one punctuation vector to obtain n word vector sets corresponding to n segmentation codes; perform an encoding operation on each segmentation code by position encoding, and determine the position information encoding of each segmentation code to obtain the position vector of each word in the target text sequence; and generate the sentence vector of the target text sequence according to the word vector and the position vector of each word in the target text sequence.
  • the first labeling module 202 is configured to input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into the trained BERT model, and to output, through the BERT model, the first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags.
  • the first labeling module 202 is further configured to: perform feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag for each word in the target text sequence; and generate the first labeling sequence according to those first probabilities.
  • the second labeling module 204 is configured to input the first labeling sequence to the fully connected layer, and output a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, Each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability of the corresponding word belonging to each of the n second tags.
  • the second labeling module 204 is further configured to: input the first labeling sequence into the neural network structure of the fully connected layer and perform additional feature extraction via B_i = wX_i + b, where X_i is the vector of first probabilities of the first tags of the i-th word and w and b are model learning parameters, to obtain the second probability of each second tag for each word in the target text sequence; and generate the second labeling sequence according to those second probabilities.
  • the output label module 206 is further configured to: input the second labeling sequence into the CRF model; perform Viterbi decoding on the second labeling sequence through the Viterbi algorithm to obtain the optimal solution path, wherein the optimal solution path is the label sequence with the highest probability over the entire target text sequence; and generate the label sequence according to the optimal solution path.
  • the output entity module 208 is configured to generate a named entity sequence according to the tag sequence, and output the named entity sequence.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and a sequence labeling system 20 that can communicate with each other through a system bus.
  • the memory 21 includes at least one type of non-volatile computer-readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, for example, a hard disk or a memory of the computer device 2.
  • the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store an operating system and various application software installed in the computer device 2, for example, the program code of the sequence labeling system 20 in the second embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the sequence labeling system 20 to implement the sequence labeling method of the first embodiment.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices.
  • the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal.
  • the network may be an intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
  • FIG. 3 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the sequence labeling system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete this application.
  • FIG. 2 shows a schematic diagram of program modules for implementing the sequence labeling system 20 according to the second embodiment of the present application.
  • the sequence labeling system 20 can be divided into the text receiving module 200, the first labeling module 202, the second labeling module 204, the output label module 206, and the output entity module 208.
  • the program module referred to in this application refers to a series of computer-readable instruction segments that can complete specific functions. The specific functions of the program modules 200-208 have been described in detail in the second embodiment, and will not be repeated here.
  • This embodiment also provides a non-volatile computer-readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, a server, an application store, etc., on which computer-readable instructions are stored; the corresponding functions are realized when the instructions are executed by a processor.
  • the non-volatile computer-readable storage medium of this embodiment is used to store the sequence labeling system 20; when executed by the processor, the computer-readable instructions implement the following steps:
  • receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
  • inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
  • inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
  • taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
  • generating a named entity sequence according to the label sequence, and outputting the named entity sequence.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

A sequence labeling method, comprising: receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word (S100); inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, by means of the BERT model, a first labeling sequence corresponding to the target text sequence; inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence by means of the fully connected layer; taking the second labeling sequence as an input sequence of a conditional random field (CRF) model so as to output a label sequence Y = (y1, y2, ..., ym) by means of the CRF model (S106); and generating a named entity sequence according to the label sequence, and outputting the named entity sequence (S108). The method solves the problem that existing models cannot take long-range contextual information into account, so that named entities can be extracted from a text by feeding it directly into the model, improving the accuracy of entity recognition.

Description

Sequence labeling method, system, and computer device
This application claims priority to the Chinese patent application No. 201910983279.2, filed on October 16, 2019 and entitled "Sequence labeling method, system and computer equipment", the entire content of which is incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of sequence labeling, and in particular to a sequence labeling method, system, computer device, and non-volatile computer-readable storage medium.
Background
Among all natural language processing applications, named entity recognition is the most basic and most widely used. It identifies entities with specific meanings in text, mainly including names of persons, places, and organizations, and proper nouns. Named entity recognition is an important foundational tool for downstream applications such as information extraction, question answering, syntactic analysis, machine translation, and semantic-web-oriented metadata annotation. By applying named entity recognition, a natural language model can be constructed that understands, analyzes, and answers natural language much as humans do. However, existing models often fail to take long-range contextual information into account, which limits recognition accuracy.
Therefore, how to overcome the inability of existing models to consider long-range contextual relationships, and thereby further improve the recognition accuracy of sequence labeling, has become one of the technical problems to be solved.
Summary of the invention
In view of this, it is necessary to provide a sequence labeling method, system, computer device, and non-volatile computer-readable storage medium to solve the technical problem that existing models cannot consider long-range contextual relationships, which limits the recognition accuracy of sequence labeling.
To achieve the above objective, an embodiment of the present application provides a sequence labeling method, the steps of which include:
receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
To achieve the above objective, an embodiment of the present application also provides a sequence labeling system, including:
a text receiving module, configured to receive a target text sequence and convert it into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
a first labeling module, configured to input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and to output, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
a second labeling module, configured to input the first labeling sequence into a fully connected layer and to output a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
an output label module, configured to take the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
an output entity module, configured to generate a named entity sequence according to the label sequence and to output the named entity sequence.
To achieve the above objective, an embodiment of the present application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the computer-readable instructions, when executed by the processor, implement the following steps:
receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
To achieve the above objective, the embodiments of the present application also provide a non-volatile computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor, causing the at least one processor to execute the following steps:
receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;
inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;
taking the second labeling sequence as the input sequence of a conditional random field (CRF) model, so as to output a label sequence Y = (y1, y2, ..., ym) through the CRF model; and
generating a named entity sequence according to the label sequence, and outputting the named entity sequence.
The sequence labeling method, system, computer device, and non-volatile computer-readable storage medium provided by the embodiments of the present application offer an effective way to label text sequences. They solve the prior-art problem that models cannot take long-range contextual information into account, which limits recognition accuracy, so that named entities can be extracted by feeding the raw sentence directly into the model, with strong adaptability, wide applicability, and improved accuracy of sequence labeling for entity recognition.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a sequence labeling method according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the program modules of the second embodiment of the sequence labeling system of the present application.
FIG. 3 is a schematic diagram of the hardware structure of the third embodiment of the computer device of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not used to limit it. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.
It should be noted that descriptions involving "first", "second", and the like in the present application are for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments can be combined with each other, but only on the basis that they can be realized by a person of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, it should be considered that the combination does not exist and is not within the protection scope claimed by the present application.
In the following embodiments, the computer device 2 is used as the execution subject for the exemplary description.
Embodiment 1
Referring to FIG. 1, a flowchart of the steps of a sequence labeling method according to an embodiment of the present application is shown. It can be understood that the flowchart in this method embodiment is not intended to limit the order in which the steps are executed. The following exemplary description takes the computer device 2 as the execution subject. The details are as follows.
Step S100: Receive a target text sequence, and convert the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word.
Specifically, step S100 may further include:
Step S100a: Input the target text sequence into an embedding layer, which outputs a plurality of word vectors corresponding to the target text sequence; the plurality of word vectors includes at least one punctuation vector.
For example, when the received target text sequence is [Curie was born in Poland, lives in the United States], each character and special symbol needs to be converted into a word embedding vector, because a neural network can only perform numerical calculations.
Step S100b: Input the plurality of word vectors into a segmentation layer, and divide them according to the at least one punctuation vector to obtain n word vector sets corresponding to n segmentation codes.
For example, the target text sequence [Curie was born in Poland, lives in the United States] is divided into sentence A [Curie was born in Poland] and sentence B [lives in the United States]; segmentation code A is added to the first half-sentence and segmentation code B to the second.
Step S100c: Perform an encoding operation on each segmentation code by position encoding, and determine the position information encoding of each segmentation code to obtain the position vector of each word in the target text sequence.
For example, the position information encoding may be used to determine the position of each word in the target text sequence.
Step S100d: Generate the sentence vector of the target text sequence according to the word vector and the position vector of each word in the target text sequence.
Step S102: Input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into the trained BERT model, which outputs the first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and the first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags.
For example, the n first tags may be a set of position tags and semantic tags, or a set of position tags and part-of-speech tags.
BERT is an existing pre-trained model. Its full name is Bidirectional Encoder Representations from Transformers, i.e., a bidirectional Transformer encoder; the Transformer is an architecture that relies entirely on self-attention to compute representations of its input and output. BERT aims to pre-train deep bidirectional representations by jointly conditioning on context in all layers. The pre-trained BERT can therefore be fine-tuned with an additional output layer to build state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
The BERT model can be obtained by capturing words through a masked language model (MLM) method and representing the sentence level through a "next sentence prediction" method. The masked language model randomly masks some of the input tokens, and the goal is to predict the original vocabulary id of each masked word based only on its context; unlike left-to-right language-model pre-training, this training objective lets the representation fuse context from both the left and the right, so that a deep bidirectional Transformer can be pre-trained. Next sentence prediction means that when pre-training the language model, two sentences are selected in one of two ways: either two sentences that genuinely follow each other in the corpus, or a first sentence paired with a second sentence drawn at random from the corpus; in addition to the masked language model task, the model is asked to predict whether the second sentence really follows the first.
Training the pre-trained BERT model may include: acquiring multiple training text sequences, using them as the training set of the BERT model, and feeding the training set into the pre-trained BERT model to train it and obtain a trained BERT model.
Specifically, step S102 may further include:
Step S102a: Perform feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag for each word in the target text sequence.
Step S102b: Generate the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
Step S104: Input the first labeling sequence into the fully connected layer, which outputs the second labeling sequence, wherein the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and the second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags.
For example, the n second tags may be a set of position tags and semantic tags, or a set of position tags and part-of-speech tags.
Specifically, step S104 may further include:
Step S104a: Input the first labeling sequence into the neural network structure of the fully connected layer and perform additional feature extraction to obtain the second probability of each second tag for each word in the target text sequence; for the i-th word of the target text sequence, the additional feature extraction is computed as B_i = wX_i + b, where X_i is the vector of first probabilities of the first tags of the i-th word in the first labeling sequence, and w and b are model learning parameters.
The neural network structure of the fully connected layer of this embodiment may be a multi-layer transformer structure, which further includes an attention mechanism: after the first labeling sequence is processed by the attention mechanism, it is fed into the feedforward fully connected neural network structure for additional feature extraction, yielding the second probability of each second tag for each word in the target text sequence; that is, the second probabilities are obtained through the operation wx + b, where x is the sequence and w and b are model learning parameters.
Step S104b: Generate the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
步骤S106,将所述第二标注序列作为条件随机场CRF模型的输入序列,以通过CRF模型输出标签序列Y=(y 1,y 2,...,y m)。 Step S106: Use the second label sequence as the input sequence of the conditional random field CRF model to output the label sequence Y=(y 1 , y 2 ,..., y m ) through the CRF model.
Specifically, step S106 may further include the following sub-steps.

Step S106a: Input the second labeling sequence into the CRF model.

Step S106b: Perform Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain the optimal solution path in the second labeling sequence, where the optimal solution path is the tag sequence with the highest probability over the entire target text sequence.

Exemplarily, this step determines the output to which the target text sequence should correspond according to the probability values of the second probabilities of the second tags of each word in the target text sequence. This is implemented here through the Viterbi algorithm: rather than outputting, for each word, the tag with the highest second probability, the Viterbi algorithm outputs the labeling sequence with the highest probability over the entire target text sequence.

Exemplarily, the Viterbi algorithm may rest on the following observations: if the path with the highest overall probability passes through a certain point of the lattice (fence) network, then the sub-path from the start point to that point must also be the most probable path from the start to that point; and if there are k states at moment i, there are k shortest paths from the start to these k states, and the final shortest path must pass through one of them.

Step S106c: Generate the tag sequence according to the optimal solution path.

Exemplarily, the labeling sequence with the highest probability over the entire target text sequence is computed through the Viterbi algorithm: when computing the shortest path to the (i+1)-th state, only the shortest paths from the start to the current k state values and the shortest paths from those state values to the (i+1)-th state value need to be considered.
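The described dynamic-programming property can be illustrated with the following sketch of a Viterbi decoder over per-word tag scores and a transition matrix; the shapes and score conventions are assumptions for the sketch rather than details fixed by this application:

```python
# Sketch of Viterbi decoding: emissions[i, t] scores tag t at position i,
# transitions[s, t] scores moving from tag s to tag t.
import numpy as np

def viterbi(emissions: np.ndarray, transitions: np.ndarray) -> list[int]:
    m, k = emissions.shape
    score = emissions[0].copy()            # best score of a path ending in each tag
    backptr = np.zeros((m, k), dtype=int)  # best predecessor tag at each step
    for i in range(1, m):
        # Extend only the k best paths kept so far (the lattice property):
        # cand[s, t] = best path through tag s at i-1, then tag t at i.
        cand = score[:, None] + transitions + emissions[i][None, :]
        backptr[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # Trace back from the best final tag to recover the optimal path.
    path = [int(score.argmax())]
    for i in range(m - 1, 0, -1):
        path.append(int(backptr[i, path[-1]]))
    return path[::-1]  # the highest-probability tag sequence for the whole text
```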
Step S108: Generate a named entity sequence according to the tag sequence, and output the named entity sequence.

Exemplarily, a named entity sequence can be generated from the tag sequence; the named entity sequence is the prediction of the labeling system for the target text sequence. The named entities include place names, person names, and the like. Sequence labeling adopts the BIOES scheme, where B marks the beginning of an entity, I the middle of an entity, O a non-entity, E the end of an entity, and S a single-word entity; each named entity tag also corresponds to an entity category and can be refined into forms such as B-place-name, i.e., the beginning of a place-name entity. Taking place names and person names as an example, the sentence "居里生于华沙" ("Curie was born in Warsaw") is split into a sequence of characters, where 居 is tagged B-person-name, 里 is tagged E-person-name, 生 is tagged O, 于 is tagged O, 华 is tagged B-place-name, and 沙 is tagged E-place-name.
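For illustration, a short sketch of how a BIOES tag sequence could be converted into named entities follows; the concrete tag strings (B-PER, E-LOC, etc.) are assumed for the example and are not fixed by this application:

```python
# Sketch: turn a BIOES tag sequence into (entity text, category) pairs.
def extract_entities(chars, tags):
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            start = i                                   # entity begins
        elif tag.startswith("E-") and start is not None:
            entities.append(("".join(chars[start:i + 1]), tag[2:]))
            start = None                                # entity ends
        elif tag.startswith("S-"):
            entities.append((chars[i], tag[2:]))        # single-word entity
        elif tag == "O":
            start = None                                # non-entity resets
    return entities

chars = list("居里生于华沙")
tags = ["B-PER", "E-PER", "O", "O", "B-LOC", "E-LOC"]
print(extract_entities(chars, tags))  # [('居里', 'PER'), ('华沙', 'LOC')]
```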
Embodiment Two

FIG. 2 is a schematic diagram of the program modules of the sequence labeling system according to Embodiment Two of this application. The sequence labeling system 20 may include, or be divided into, one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and implement the above sequence labeling method. A program module referred to in the embodiments of this application is a series of computer-readable instruction segments capable of completing a specific function. The following description specifically introduces the functions of each program module of this embodiment.

The receiving text module 200 is configured to receive a target text sequence and convert the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word.

Exemplarily, the receiving text module 200 is further configured to: input the target text sequence into an embedding layer, and output, through the embedding layer, a plurality of word vectors corresponding to the target text sequence, the plurality of word vectors including at least one punctuation vector; input the plurality of word vectors into a segmentation layer, and segment the plurality of word vectors according to the at least one punctuation vector to obtain n word vector sets, the n word vector sets corresponding to n segmentation codes; perform an encoding operation on each segmentation code through position encoding, and determine the position information encoding of each segmentation code to obtain the position vector of each word in the target text sequence; and generate the sentence vector of the target text sequence according to the word vector of each word in the target text sequence and the position vector of each word.
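A rough sketch of the punctuation-based segmentation described above follows; the punctuation set, the per-segment codes, and the position indices are assumptions, since the application does not fix these details:

```python
# Rough sketch: assign each character a segmentation code (segment index)
# and a position index; segments are closed by punctuation characters.
PUNCTUATION = set("，。！？；")  # assumed punctuation set

def segmentation_and_positions(text):
    seg_codes, positions = [], []
    seg = 0
    for pos, ch in enumerate(text):
        seg_codes.append(seg)   # which punctuation-delimited segment ch is in
        positions.append(pos)   # position of ch in the whole sequence
        if ch in PUNCTUATION:   # a punctuation vector closes the segment
            seg += 1
    return seg_codes, positions

codes, positions = segmentation_and_positions("居里生于华沙，她是物理学家。")
```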
The first labeling module 202 is configured to input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into the trained BERT model, and to output, through the BERT model, a first labeling sequence corresponding to the target text sequence, where the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags.

Exemplarily, the first labeling module 202 is further configured to: perform feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag of each word in the target text sequence; and generate the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.

The second labeling module 204 is configured to input the first labeling sequence into the fully connected layer and to output a second labeling sequence through the fully connected layer, where the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags.

Exemplarily, the second labeling module 204 is further configured to: input the first labeling sequence into the neural network structure of the fully connected layer and perform additional feature extraction to obtain the second probability of each second tag of each word in the target text sequence, where the operation formula of the additional feature extraction for the i-th word in the target text sequence is B_i = wX_i + b, X_i is the first probability of each first tag of the i-th word in the first labeling sequence, and w and b are learning parameters of the BERT model; and generate the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.

The output label module 206 is configured to use the second labeling sequence as the input sequence of a conditional random field (CRF) model to output the tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model.

Exemplarily, the output label module 206 is further configured to: input the second labeling sequence into the CRF model; perform Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain the optimal solution path in the second labeling sequence, where the optimal solution path is the tag sequence with the highest probability over the entire target text sequence; and generate the tag sequence according to the optimal solution path.

The output entity module 208 is configured to generate a named entity sequence according to the tag sequence and to output the named entity sequence.
Embodiment Three

Refer to FIG. 3, which is a schematic diagram of the hardware architecture of the computer device according to Embodiment Three of this application. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server, or a server cluster composed of multiple servers). As shown in the figure, the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and the sequence labeling system 20, which can be communicatively connected to one another through a system bus.

In this embodiment, the memory 21 includes at least one type of non-volatile computer-readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, for example, a hard disk or an internal memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 2. Of course, the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and various application software installed on the computer device 2, for example, the program code of the sequence labeling system 20 of Embodiment Two. In addition, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code or process the data stored in the memory 21, for example, to run the sequence labeling system 20, so as to implement the sequence labeling method of Embodiment One.

The network interface 23 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel, a communication connection, and the like between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.

It should be pointed out that FIG. 3 only shows the computer device 2 with components 20-23, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead.

In this embodiment, the sequence labeling system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete this application.

For example, FIG. 2 shows a schematic diagram of the program modules implementing the sequence labeling system 20 according to Embodiment Two of this application. In that embodiment, the sequence labeling system 20 may be divided into the receiving text module 200, the first labeling module 202, the second labeling module 204, the output label module 206, and the output entity module 208. A program module referred to in this application is a series of computer-readable instruction segments capable of completing a specific function. The specific functions of the program modules 200-208 have been described in detail in Embodiment Two and will not be repeated here.
Embodiment Four

This embodiment further provides a non-volatile computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disc, a server, an app store, or the like, on which computer-readable instructions are stored; the corresponding functions are implemented when the program is executed by a processor. The non-volatile computer-readable storage medium of this embodiment is used for the sequence labeling system 20 and, when executed by a processor, implements the following steps:

receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;

inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into the trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, where the first labeling sequence includes a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents the first probability that the corresponding word belongs to each of n first tags;

inputting the first labeling sequence into the fully connected layer, and outputting a second labeling sequence through the fully connected layer, where the second labeling sequence includes a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents the second probability that the corresponding word belongs to each of n second tags;

using the second labeling sequence as the input sequence of a conditional random field (CRF) model to output the tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model; and

generating a named entity sequence according to the tag sequence, and outputting the named entity sequence.

The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and certainly also by hardware, but in many cases the former is the better implementation.

The above are only preferred embodiments of this application and do not limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, shall likewise fall within the patent protection scope of this application.

Claims (20)

  1. A sequence labeling method, the method comprising:
    receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
    inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence comprises a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents a first probability that the corresponding word belongs to each of n first tags;
    inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence comprises a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents a second probability that the corresponding word belongs to each of n second tags;
    using the second labeling sequence as an input sequence of a conditional random field (CRF) model to output a tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model; and
    generating a named entity sequence according to the tag sequence, and outputting the named entity sequence.
  2. The sequence labeling method of claim 1, wherein the step of converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word comprises:
    inputting the target text sequence into an embedding layer, and outputting, through the embedding layer, a plurality of word vectors corresponding to the target text sequence, the plurality of word vectors comprising at least one punctuation vector;
    inputting the plurality of word vectors into a segmentation layer, and segmenting the plurality of word vectors according to the at least one punctuation vector to obtain n word vector sets, the n word vector sets corresponding to n segmentation codes;
    performing an encoding operation on each segmentation code through position encoding, and determining the position information encoding of each segmentation code, so as to obtain the position vector of each word in the target text sequence; and
    generating the sentence vector of the target text sequence according to the word vector of each word in the target text sequence and the position vector of each word.
  3. The sequence labeling method of claim 2, wherein the step of outputting, through the BERT model, the first labeling sequence corresponding to the target text sequence comprises:
    performing feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag of each word in the target text sequence; and
    generating the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
  4. The sequence labeling method of claim 3, wherein the step of inputting the first labeling sequence into the fully connected layer and outputting the second labeling sequence through the fully connected layer comprises:
    inputting the first labeling sequence into the neural network structure of the fully connected layer, and performing additional feature extraction to obtain the second probability of each second tag of each word in the target text sequence, wherein the operation formula of the additional feature extraction for the i-th word in the target text sequence is B_i = wX_i + b, where X_i is the first probability of each first tag of the i-th word in the first labeling sequence, and w and b are learning parameters of the BERT model; and
    generating the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
  5. The sequence labeling method of claim 1, wherein the step of using the second labeling sequence as the input sequence of the conditional random field (CRF) model to output the tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model comprises:
    inputting the second labeling sequence into the CRF model;
    performing Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain an optimal solution path in the second labeling sequence, wherein the optimal solution path is the tag sequence with the highest probability over the entire target text sequence; and
    generating the tag sequence according to the optimal solution path.
  6. A sequence labeling system, comprising:
    a receiving text module, configured to receive a target text sequence and convert the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
    a first labeling module, configured to input the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and to output, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence comprises a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents a first probability that the corresponding word belongs to each of n first tags;
    a second labeling module, configured to input the first labeling sequence into a fully connected layer and to output a second labeling sequence through the fully connected layer, wherein the second labeling sequence comprises a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents a second probability that the corresponding word belongs to each of n second tags;
    an output label module, configured to use the second labeling sequence as an input sequence of a conditional random field (CRF) model to output a tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model; and
    an output entity module, configured to generate a named entity sequence according to the tag sequence and to output the named entity sequence.
  7. The sequence labeling system of claim 6, wherein the receiving text module is further configured to:
    input the target text sequence into an embedding layer, and output, through the embedding layer, a plurality of word vectors corresponding to the target text sequence, the plurality of word vectors comprising at least one punctuation vector;
    input the plurality of word vectors into a segmentation layer, and segment the plurality of word vectors according to the at least one punctuation vector to obtain n word vector sets, the n word vector sets corresponding to n segmentation codes;
    perform an encoding operation on each segmentation code through position encoding, and determine the position information encoding of each segmentation code, so as to obtain the position vector of each word in the target text sequence; and
    generate the sentence vector of the target text sequence according to the word vector of each word in the target text sequence and the position vector of each word.
  8. The sequence labeling system of claim 7, wherein the first labeling module is further configured to:
    perform feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag of each word in the target text sequence; and
    generate the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
  9. The sequence labeling system of claim 8, wherein the second labeling module is further configured to:
    input the first labeling sequence into the neural network structure of the fully connected layer, and perform additional feature extraction to obtain the second probability of each second tag of each word in the target text sequence, wherein the operation formula of the additional feature extraction for the i-th word in the target text sequence is B_i = wX_i + b, where X_i is the first probability of each first tag of the i-th word in the first labeling sequence, and w and b are learning parameters of the BERT model; and
    generate the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
  10. The sequence labeling system of claim 6, wherein the output label module is further configured to:
    input the second labeling sequence into the CRF model;
    perform Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain an optimal solution path in the second labeling sequence, wherein the optimal solution path is the tag sequence with the highest probability over the entire target text sequence; and
    generate the tag sequence according to the optimal solution path.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
    inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence comprises a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents a first probability that the corresponding word belongs to each of n first tags;
    inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence comprises a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents a second probability that the corresponding word belongs to each of n second tags;
    using the second labeling sequence as an input sequence of a conditional random field (CRF) model to output a tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model; and
    generating a named entity sequence according to the tag sequence, and outputting the named entity sequence.
  12. The computer device of claim 11, wherein the step of converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word comprises:
    inputting the target text sequence into an embedding layer, and outputting, through the embedding layer, a plurality of word vectors corresponding to the target text sequence, the plurality of word vectors comprising at least one punctuation vector;
    inputting the plurality of word vectors into a segmentation layer, and segmenting the plurality of word vectors according to the at least one punctuation vector to obtain n word vector sets, the n word vector sets corresponding to n segmentation codes;
    performing an encoding operation on each segmentation code through position encoding, and determining the position information encoding of each segmentation code, so as to obtain the position vector of each word in the target text sequence; and
    generating the sentence vector of the target text sequence according to the word vector of each word in the target text sequence and the position vector of each word.
  13. The computer device of claim 12, wherein the step of outputting, through the BERT model, the first labeling sequence corresponding to the target text sequence comprises:
    performing feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag of each word in the target text sequence; and
    generating the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
  14. The computer device of claim 13, wherein the step of inputting the first labeling sequence into the fully connected layer and outputting the second labeling sequence through the fully connected layer comprises:
    inputting the first labeling sequence into the neural network structure of the fully connected layer, and performing additional feature extraction to obtain the second probability of each second tag of each word in the target text sequence, wherein the operation formula of the additional feature extraction for the i-th word in the target text sequence is B_i = wX_i + b, where X_i is the first probability of each first tag of the i-th word in the first labeling sequence, and w and b are learning parameters of the BERT model; and
    generating the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
  15. The computer device of claim 11, wherein the step of using the second labeling sequence as the input sequence of the conditional random field (CRF) model to output the tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model comprises:
    inputting the second labeling sequence into the CRF model;
    performing Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain an optimal solution path in the second labeling sequence, wherein the optimal solution path is the tag sequence with the highest probability over the entire target text sequence; and
    generating the tag sequence according to the optimal solution path.
  16. A non-volatile computer-readable storage medium, storing computer-readable instructions that are executable by at least one processor to cause the at least one processor to perform the following steps:
    receiving a target text sequence, and converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word;
    inputting the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word into a trained BERT model, and outputting, through the BERT model, a first labeling sequence corresponding to the target text sequence, wherein the first labeling sequence comprises a plurality of first n-dimensional vectors, each first n-dimensional vector corresponds to a word in the target text sequence, and each first n-dimensional vector represents a first probability that the corresponding word belongs to each of n first tags;
    inputting the first labeling sequence into a fully connected layer, and outputting a second labeling sequence through the fully connected layer, wherein the second labeling sequence comprises a plurality of second n-dimensional vectors, each second n-dimensional vector corresponds to a word in the target text sequence, and each second n-dimensional vector represents a second probability that the corresponding word belongs to each of n second tags;
    using the second labeling sequence as an input sequence of a conditional random field (CRF) model to output a tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model; and
    generating a named entity sequence according to the tag sequence, and outputting the named entity sequence.
  17. The non-volatile computer-readable storage medium of claim 16, wherein the step of converting the target text sequence into a corresponding sentence vector, a word vector of each word, and a position vector of each word comprises:
    inputting the target text sequence into an embedding layer, and outputting, through the embedding layer, a plurality of word vectors corresponding to the target text sequence, the plurality of word vectors comprising at least one punctuation vector;
    inputting the plurality of word vectors into a segmentation layer, and segmenting the plurality of word vectors according to the at least one punctuation vector to obtain n word vector sets, the n word vector sets corresponding to n segmentation codes;
    performing an encoding operation on each segmentation code through position encoding, and determining the position information encoding of each segmentation code, so as to obtain the position vector of each word in the target text sequence; and
    generating the sentence vector of the target text sequence according to the word vector of each word in the target text sequence and the position vector of each word.
  18. The non-volatile computer-readable storage medium of claim 17, wherein the step of outputting, through the BERT model, the first labeling sequence corresponding to the target text sequence comprises:
    performing feature extraction on the sentence vector of the target text sequence, the word vector of each word, and the position vector of each word through the BERT model to obtain the first probability of each first tag of each word in the target text sequence; and
    generating the first labeling sequence according to the first probability of each first tag of each word in the target text sequence.
  19. The non-volatile computer-readable storage medium of claim 18, wherein the step of inputting the first labeling sequence into the fully connected layer and outputting the second labeling sequence through the fully connected layer comprises:
    inputting the first labeling sequence into the neural network structure of the fully connected layer, and performing additional feature extraction to obtain the second probability of each second tag of each word in the target text sequence, wherein the operation formula of the additional feature extraction for the i-th word in the target text sequence is B_i = wX_i + b, where X_i is the first probability of each first tag of the i-th word in the first labeling sequence, and w and b are learning parameters of the BERT model; and
    generating the second labeling sequence according to the second probability of each second tag of each word in the target text sequence.
  20. The non-volatile computer-readable storage medium of claim 16, wherein the step of using the second labeling sequence as the input sequence of the conditional random field (CRF) model to output the tag sequence Y = (y_1, y_2, ..., y_m) through the CRF model comprises:
    inputting the second labeling sequence into the CRF model;
    performing Viterbi solving on the second labeling sequence through the Viterbi algorithm to obtain an optimal solution path in the second labeling sequence, wherein the optimal solution path is the tag sequence with the highest probability over the entire target text sequence; and
    generating the tag sequence according to the optimal solution path.