WO2023060633A1 - Semantically enhanced relation extraction method, apparatus, computer device and storage medium - Google Patents

Semantically enhanced relation extraction method, apparatus, computer device and storage medium

Info

Publication number
WO2023060633A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
word
vector matrix
information
speech
Prior art date
Application number
PCT/CN2021/124642
Other languages
English (en)
French (fr)
Inventor
陈永红
张日
张军涛
Original Assignee
深圳前海环融联易信息科技服务有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海环融联易信息科技服务有限公司
Publication of WO2023060633A1

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G06F16/353 - Clustering; Classification into predefined classes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/253 - Grammatical analysis; Style critique
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/268 - Morphological analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/047 - Probabilistic or stochastic networks

Definitions

  • the present application relates to the technical field of natural language processing, and in particular to a semantically enhanced relation extraction method, apparatus, computer device and storage medium.
  • the purpose of this application is to provide a semantically enhanced relation extraction method, apparatus, computer device and storage medium, aiming to solve the problem that insufficient parameter learning in existing relation extraction tasks leaves the quality of relation extraction in need of improvement.
  • the first vector matrix is input into the intermediate layer of the transformer model for entity classification, and a second vector matrix containing entity category information is output;
  • the second vector matrix is input into the top layer of the transformer model for dependency parsing, and a third vector matrix containing sentence structure information and inter-word dependencies is output;
  • the present application also provides a semantically enhanced relation extraction apparatus, which includes:
  • a part-of-speech classification unit, used to input the original vector matrix of a sentence into the bottom layer of the transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information;
  • an entity classification unit, used to input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information;
  • a dependency parsing unit, used to input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies;
  • a convolution unit, used to concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors and output the relation prediction value of the sentence.
  • an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and runnable on the processor.
  • when the processor executes the computer program, the semantically enhanced relation extraction method described in the first aspect above is implemented.
  • an embodiment of the present application also provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the semantically enhanced relation extraction method described in the first aspect above.
  • the embodiments of the present application disclose a semantically enhanced relation extraction method, apparatus, computer device and storage medium.
  • the method comprises: inputting the original vector matrix of a sentence into the bottom layer of the transformer model for part-of-speech classification, and outputting a first vector matrix containing part-of-speech information; inputting the first vector matrix into the intermediate layer of the transformer model for entity classification, and outputting a second vector matrix containing entity category information; and inputting the second vector matrix into the top layer of the transformer model for dependency parsing, and outputting a third vector matrix containing sentence structure information and inter-word dependencies.
  • different learning tasks are added at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, which yields better results on the relation extraction task.
  • FIG. 1 is a schematic flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
  • FIG. 2 is a schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
  • FIG. 3 is another schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
  • FIG. 4 is another schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
  • FIG. 5 is a schematic block diagram of a semantically enhanced relation extraction apparatus provided by an embodiment of the present application;
  • FIG. 6 is a schematic block diagram of a computer device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a semantically enhanced relation extraction method provided by an embodiment of the present application;
  • the method includes steps S101-S104.
  • the part-of-speech information of the sentence is learned at the bottom layer of the transformer model; focusing on the most fine-grained structural information of the sentence can effectively improve the transformer model's extraction of global information.
  • the category information of the entities in the sentence is learned in the intermediate layer of the transformer model, and the category information of the entities greatly facilitates resolving the relations between entities.
  • the structural information of the sentence and the dependencies between words are learned at the top layer of the transformer model, which also helps the sentence further filter out irrelevant information and obtain more accurate context information.
  • a CNN (convolutional neural network) is used to convolve the sentence vectors learned by the different layers, combining local and global information so that the relations between entities can be extracted better.
  • different learning tasks are added at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, yielding better results on the relation extraction task.
  • the step S101 includes:
  • each word in the sentence has a part of speech, such as noun, verb, adjective or adverb.
  • the parts of speech of different words also influence one another; for example, adjectives come before nouns and adverbs before verbs.
  • in the bottom layer of the transformer model, after the sentence is segmented, the part-of-speech vector information of each word is output; at the same time, each word is classified according to a preset multi-class task and given a part-of-speech label.
  • for example, for the sentence 'people have travelled into space, but not the deep ocean', the labels of the words {people, travelled, space} are {NNS, VBN, NNP}; finally, the first vector matrix containing part-of-speech information is obtained and used as the input of the subsequent intermediate layer of the transformer model.
  • the part-of-speech category of each word in step S202 is predicted with a softmax-normalized linear classifier: $\hat{y}_i^{pos} = \mathrm{softmax}(W_{pos} Z_i + b_{pos})$, where $\hat{y}_i^{pos}$ gives, for the i-th word, the probability of each part-of-speech category, $W_{pos}$ and $b_{pos}$ are the parameters of the linear classifier, and $Z_i$ is the part-of-speech vector information of the i-th word in the original vector matrix.
  • the step S102 includes:
  • the entity category of each entity pair is calculated analogously: $\hat{y}_i^{ent} = \mathrm{softmax}(W_{ent} Z'_i + b_{ent})$, where $Z'_i$ is the vector information of the i-th entity pair in the first vector matrix and $W_{ent}$, $b_{ent}$ are the parameters of a linear classifier;
  • a second vector matrix containing entity category information is obtained according to the entity categories of the entity pairs.
  • the two entities in the sentence are extracted separately for entity classification, judging whether each is a person name, a place name, an organization name or another entity category.
  • the entity types of the entity pair {people, space} are {Group, Location}; concretely, $Z'_i$, $W_{ent}$ and $b_{ent}$ are substituted into the above formula to obtain the probability of each entity category for the i-th word, and the entity category with the highest probability is selected as the entity category of the i-th word. Finally, the second vector matrix containing entity category information is obtained and used as the input of the subsequent top layer of the transformer model.
  • the step S103 includes:
  • the Bi-affine method (biaffine attention mechanism) is used to learn the dependency tree of the sentence, so that the transformer model can learn the structural information of the sentence and the dependencies between words well. Taking the aforementioned sentence 'people have travelled into space, but not the deep ocean' as an example, the minimum dependency path between the two entities is {people ← travelled → into → space}, which captures more of the relevant information, while the second half of the sentence, 'but not the deep ocean', contributes little to resolving the relation. It can be seen that building a dependency tree helps the sentence further filter out irrelevant information and obtain more accurate context information.
  • the parent node in the dependency path of each word in the second vector matrix is calculated as $\hat{y}_i^{head} = \mathrm{softmax}(W_{GR} \tilde{Z}'''_i + b_{GR})$, where $\tilde{Z}'''_i$ is $Z'''_i$ after the Bi-affine operation and $W_{GR}$, $b_{GR}$ are the parameters of a linear classifier;
  • $Z'''_i$, $b_{GR}$ and $W_{GR}$ are substituted into the formula to obtain the probability of each candidate parent node for the i-th word, and the parent with the highest probability is selected as the parent node of the i-th word.
  • the third vector matrix containing sentence structure information and inter-word dependencies is obtained as the final output of the transformer model.
  • each layer of the transformer model is composed of multi-head self-attention (a self-attention mechanism).
  • when self-attention learns the weights between words, the weights are independent of the distance between the words, so each word can obtain the global information of the sentence.
  • the step S104 includes:
  • the word vectors in the third vector matrix are concatenated with their corresponding word position vectors in the original vector matrix as $v_i = z_i \oplus p_i$, giving $V = [v_1, \ldots, v_{len}]$ for the whole sentence;
  • the concatenated vectors add the spatial position information of the sentence, and CNN convolution is performed on the concatenated vectors.
  • the CNN convolution operation uses a convolution kernel to fuse the information within a sliding window, i.e., only the information of adjacent words is aggregated; combining local and global information in this way allows the relations between entities to be extracted better. After the convolution operation, the result is fed into a max-pooling layer for pooling, the pooled vector is fed into the softmax function, and the relation prediction value of the sentence is output.
  • the embodiment of the present application also provides a semantically enhanced relation extraction apparatus, which is used to implement any embodiment of the aforementioned semantically enhanced relation extraction method.
  • FIG. 5 is a schematic block diagram of a semantically enhanced relation extraction apparatus provided by an embodiment of the present application.
  • the semantically enhanced relation extraction apparatus 500 includes: a part-of-speech classification unit 501, an entity classification unit 502, a dependency parsing unit 503 and a convolution unit 504.
  • the part-of-speech classification unit 501 is used to input the original vector matrix of a sentence into the bottom layer of the transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information;
  • the entity classification unit 502 is used to input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information;
  • the dependency parsing unit 503 is used to input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies;
  • the convolution unit 504 is used to concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors and output the relation prediction value of the sentence.
  • the apparatus adds different learning tasks at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, and achieves better results on relation extraction tasks.
  • the above semantically enhanced relation extraction apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 6.
  • FIG. 6 is a schematic block diagram of a computer device provided by an embodiment of the present application.
  • the computer device 600 is a server, and the server may be an independent server or a server cluster composed of multiple servers.
  • the computer device 600 includes a processor 602, a memory and a network interface 605 connected through a system bus 601, where the memory may include a non-volatile storage medium 603 and an internal memory 604.
  • the non-volatile storage medium 603 can store an operating system 6031 and a computer program 6032.
  • when the computer program 6032 is executed, it can cause the processor 602 to perform the semantically enhanced relation extraction method.
  • the processor 602 is used to provide computing and control capabilities to support the operation of the entire computer device 600.
  • the internal memory 604 provides an environment for running the computer program 6032 in the non-volatile storage medium 603;
  • when the computer program 6032 is executed by the processor 602, the processor 602 can perform the semantically enhanced relation extraction method.
  • the network interface 605 is used for network communication, such as providing transmission of data information and the like.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer device 600 to which the solution of the present application is applied.
  • a specific computer device 600 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
  • the embodiment of the computer device shown in FIG. 6 does not limit the specific composition of the computer device.
  • in other embodiments, the computer device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components.
  • in some embodiments, the computer device may include only a memory and a processor; in such an embodiment, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 6 and are not repeated here.
  • the processor 602 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • a computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program which, when executed by a processor, implements the semantically enhanced relation extraction method of the embodiments of the present application.
  • the storage medium is a physical, non-transitory storage medium, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disc, or any other physical storage medium that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

A semantically enhanced relation extraction method, apparatus, computer device and storage medium. The method comprises: inputting the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and outputting a first vector matrix containing part-of-speech information (S101); inputting the first vector matrix into the intermediate layer of the transformer model for entity classification, and outputting a second vector matrix containing entity category information (S102); and inputting the second vector matrix into the top layer of the transformer model for dependency parsing, and outputting a third vector matrix containing sentence structure information and inter-word dependencies (S103). The method adds different learning tasks at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, and therefore achieves better results on relation extraction tasks.

Description

Semantically enhanced relation extraction method, apparatus, computer device and storage medium
This application is based on, and claims priority to, Chinese patent application No. 202111188258.5 filed on October 12, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of natural language processing, and in particular to a semantically enhanced relation extraction method, apparatus, computer device and storage medium.
Background
In relation extraction tasks, most existing methods directly use the embedding (a low-dimensional vector representing an object) of an entity pair, combined with the embedding of the sentence, to perform classification. This embedding information is very limited: it ignores the type information of the entity pair itself, the part-of-speech information of the sentence, and the structural information of the dependency tree. Such information plays an important role in determining the relation between an entity pair, but is difficult for a model to learn by itself.
Summary
The purpose of the present application is to provide a semantically enhanced relation extraction method, apparatus, computer device and storage medium, aiming to solve the problem that insufficient parameter learning in existing relation extraction tasks leaves the quality of relation extraction in need of improvement.
To solve the above technical problem, the purpose of the present application is achieved through the following technical solution: a semantically enhanced relation extraction method is provided, which includes:
inputting the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and outputting a first vector matrix containing part-of-speech information;
inputting the first vector matrix into the intermediate layer of the transformer model for entity classification, and outputting a second vector matrix containing entity category information;
inputting the second vector matrix into the top layer of the transformer model for dependency parsing, and outputting a third vector matrix containing sentence structure information and inter-word dependencies;
concatenating each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolving the concatenated vectors, and outputting the relation prediction value of the sentence.
In addition, the present application further provides a semantically enhanced relation extraction apparatus, which includes:
a part-of-speech classification unit, configured to input the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information;
an entity classification unit, configured to input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information;
a dependency parsing unit, configured to input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies;
a convolution unit, configured to concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors, and output the relation prediction value of the sentence.
In addition, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein when the processor executes the computer program, the semantically enhanced relation extraction method described in the first aspect above is implemented.
In addition, an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the semantically enhanced relation extraction method described in the first aspect above.
The embodiments of the present application disclose a semantically enhanced relation extraction method, apparatus, computer device and storage medium. The method comprises: inputting the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and outputting a first vector matrix containing part-of-speech information; inputting the first vector matrix into the intermediate layer of the transformer model for entity classification, and outputting a second vector matrix containing entity category information; and inputting the second vector matrix into the top layer of the transformer model for dependency parsing, and outputting a third vector matrix containing sentence structure information and inter-word dependencies. The embodiments of the present application add different learning tasks at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, and therefore achieve better results on relation extraction tasks.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present application; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
FIG. 2 is a schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
FIG. 3 is another schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
FIG. 4 is another schematic sub-flowchart of semantically enhanced relation extraction provided by an embodiment of the present application;
FIG. 5 is a schematic block diagram of a semantically enhanced relation extraction apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "include" and "comprise" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or combinations thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should further be understood that the term "and/or" used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Please refer to FIG. 1, which is a schematic flowchart of a semantically enhanced relation extraction method provided by an embodiment of the present application.
As shown in FIG. 1, the method includes steps S101 to S104.
S101: input the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information.
In this step, the part-of-speech information of the sentence is learned at the bottom layer of the transformer model; focusing on the most fine-grained structural information of the sentence can effectively improve the transformer model's extraction of global information.
S102: input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information.
In this step, the category information of the entities in the sentence is learned in the intermediate layer of the transformer model; the category information of the entities greatly facilitates resolving the relations between entities.
S103: input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies.
In this step, the structural information of the sentence and the dependencies between words are learned at the top layer of the transformer model, which also helps the sentence further filter out irrelevant information and obtain more accurate context information.
S104: concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors, and output the relation prediction value of the sentence.
In this step, a CNN (convolutional neural network) is used to convolve the sentence vectors learned by the different layers, combining local and global information so that the relations between entities can be extracted better.
This embodiment adds different learning tasks at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, and therefore achieves better results on relation extraction tasks.
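As a purely illustrative sketch of this layout, the Python fragment below wires three auxiliary task heads to different depths of a transformer encoder. The application publishes no code; the encoder depth, the concrete layer indices and the head names here are assumptions.

```python
# Hypothetical layer split for a 12-block encoder: the application only
# speaks of "bottom", "intermediate" and "top" layers, not concrete indices.
BOTTOM, MIDDLE, TOP = 3, 7, 11

def encode_with_auxiliary_tasks(X, blocks, pos_head, entity_head, dep_head):
    """X: (sent_len, d) original vector matrix of the sentence.
    blocks: one callable per transformer block; *_head: task classifiers."""
    h, aux = X, {}
    for i, block in enumerate(blocks):
        h = block(h)                         # ordinary transformer block
        if i == BOTTOM:
            aux["pos"] = pos_head(h)         # yields the first vector matrix
        elif i == MIDDLE:
            aux["entity"] = entity_head(h)   # yields the second vector matrix
        elif i == TOP:
            aux["dependency"] = dep_head(h)  # yields the third vector matrix
    return h, aux
```

During training, the losses of the three auxiliary heads would presumably be added to the relation extraction loss, so that each depth of the encoder is pushed to encode the corresponding kind of information.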
In an embodiment, as shown in FIG. 2, the step S101 includes:
S201: segment the sentence according to its original vector matrix to obtain the part-of-speech vector information of each word;
S202: perform part-of-speech classification on each word to obtain the part-of-speech category information of each word;
S203: obtain a first vector matrix containing part-of-speech information according to the part-of-speech vector information and the part-of-speech category information of each word.
In this embodiment, every word in a sentence has a part of speech, such as noun, verb, adjective or adverb, and the parts of speech of different words also influence one another; for example, a noun is usually preceded by an adjective, and a verb is usually preceded by an adverb. In the bottom layer of the transformer model, after the sentence is segmented, the part-of-speech vector information of each word is output; at the same time, each word is classified according to a preset multi-class task and given a part-of-speech label. For example, for the sentence 'people have travelled into space, but not the deep ocean', the labels of the words {people, travelled, space} are {NNS, VBN, NNP}. Finally, the first vector matrix containing part-of-speech information is obtained and used as the input of the subsequent intermediate layer of the transformer model.
Specifically, in step S202 the part-of-speech category of each word is predicted with a softmax-normalized linear classifier:
$\hat{y}_i^{pos} = \mathrm{softmax}(W_{pos} Z_i + b_{pos})$
where $\hat{y}_i^{pos}$ is the predicted part-of-speech category of the i-th word, its components give the probability that the i-th word belongs to each part-of-speech category, $W_{pos}$ and $b_{pos}$ are the parameters of the linear classifier, and $Z_i$ is the part-of-speech vector information of the i-th word in the original vector matrix.
Substituting $Z_i$, $W_{pos}$ and $b_{pos}$ into the above formula yields the probability of the i-th word belonging to each part-of-speech category; the part-of-speech category with the highest probability is selected as the part-of-speech category of the i-th word.
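A minimal Python sketch of this classifier is given below; the softmax-over-linear-layer form follows the definitions above, while all shapes, names and the label set are illustrative assumptions.

```python
import numpy as np
from scipy.special import softmax

def predict_pos_tags(Z, W_pos, b_pos, pos_labels):
    """Z: (sent_len, d) part-of-speech vector information (one row per word);
    W_pos: (d, n_tags) and b_pos: (n_tags,) are the linear-classifier params."""
    probs = softmax(Z @ W_pos + b_pos, axis=-1)        # row i: P(tag | word i)
    return [pos_labels[j] for j in probs.argmax(axis=-1)]

# Illustrative call with random parameters (real values come from training):
rng = np.random.default_rng(0)
tags = predict_pos_tags(rng.normal(size=(7, 16)),      # 7 words, 16-dim vectors
                        rng.normal(size=(16, 4)), np.zeros(4),
                        ["NNS", "VBN", "NNP", "IN"])
```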
In an embodiment, the step S102 includes:
performing entity extraction on the sentence according to the first vector matrix to obtain a plurality of entity pairs;
calculating the entity category of each entity pair with an analogous softmax-normalized linear classifier:
$\hat{y}_i^{ent} = \mathrm{softmax}(W_{ent} Z'_i + b_{ent})$
where $\hat{y}_i^{ent}$ is the predicted entity category of the i-th entity pair, its components give the probability that the i-th entity pair belongs to each entity category, $W_{ent}$ and $b_{ent}$ are the parameters of the linear classifier, and $Z'_i$ is the vector information of the i-th entity pair in the first vector matrix;
obtaining a second vector matrix containing entity category information according to the entity categories of the entity pairs.
In this embodiment, in the intermediate layer of the transformer model, after that layer's multi-head self-attention (self-attention mechanism), the two entities in the sentence are extracted separately for entity classification, judging whether each is a person name, a place name, an organization name or another entity category. Taking the aforementioned sentence 'people have travelled into space, but not the deep ocean' as an example, the entity types of the entity pair {people, space} are {Group, Location}. Concretely, $Z'_i$, $W_{ent}$ and $b_{ent}$ are substituted into the above formula to obtain the probability of each entity category for the i-th word, and the entity category with the highest probability is selected as the entity category of the i-th word. Finally, the second vector matrix containing entity category information is obtained and used as the input of the subsequent top layer of the transformer model.
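The sketch below illustrates the entity-classification step under the same assumptions. In particular, mean-pooling a span of the first vector matrix to obtain $Z'_i$ is a guess: the application does not specify how the entity-pair vector is formed.

```python
import numpy as np
from scipy.special import softmax

def predict_entity_types(Z1, entity_spans, W_ent, b_ent, ent_labels):
    """Z1: (sent_len, d) first vector matrix; entity_spans: [(start, end), ...].
    Mean pooling over each span is an assumed way to build the pair vectors."""
    span_vecs = np.stack([Z1[s:e].mean(axis=0) for s, e in entity_spans])
    probs = softmax(span_vecs @ W_ent + b_ent, axis=-1)   # P(type | entity i)
    return [ent_labels[j] for j in probs.argmax(axis=-1)]
```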
In an embodiment, as shown in FIG. 3, the step S103 includes:
S301: parse the sentence structure according to the second vector matrix to obtain the grammatical relation corresponding to each word;
S302: parse the dependencies between words according to the second vector matrix to obtain the minimum dependency paths between words;
S303: obtain a third vector matrix containing sentence structure information and inter-word dependencies according to the grammatical relation of each word and the minimum dependency paths between words.
In this embodiment, in the top layer of the transformer model, the Bi-affine method (biaffine attention mechanism) is used to learn the dependency tree of the sentence, so that the transformer model can learn the structural information of the sentence and the dependencies between words well. Taking the aforementioned sentence 'people have travelled into space, but not the deep ocean' as an example, the minimum dependency path between the two entities is {people ← travelled → into → space}, which captures more of the relevant information, while the second half of the sentence, 'but not the deep ocean', contributes little to resolving the relation. It can thus be seen that building a dependency tree helps the sentence further filter out irrelevant information and obtain more accurate context information.
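As an illustration of how such a minimum dependency path can be read off a predicted tree, the sketch below walks parent pointers from each entity up to their lowest common ancestor. Representing the tree as a parent-index array is an assumption made for the example, not a structure taken from the application.

```python
def dependency_path(parents, src, dst):
    """Minimum dependency path between words src and dst, given each word's
    parent index (-1 for the root), e.g. people <- travelled -> into -> space."""
    def chain_to_root(i):
        chain = [i]
        while parents[chain[-1]] != -1:
            chain.append(parents[chain[-1]])
        return chain
    up, down = chain_to_root(src), chain_to_root(dst)
    common = next(n for n in up if n in down)        # lowest common ancestor
    return up[:up.index(common) + 1] + down[:down.index(common)][::-1]
```

For the example sentence with indices people=0, travelled=2, into=3, space=4 and parents[0]=2, parents[2]=-1, parents[3]=2, parents[4]=3, the call dependency_path(parents, 0, 4) returns [0, 2, 3, 4], i.e. the path {people ← travelled → into → space}.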
The grammatical relation corresponding to each word in step S301 is computed as:
$\hat{y}_i^{gr} = \mathrm{softmax}(W_{par} \tilde{Z}''_i + b_{par})$
where $\hat{y}_i^{gr}$ is the predicted grammatical relation of the i-th word, its components give the probability that the i-th word bears each grammatical relation, $W_{par}$ and $b_{par}$ are the parameters of a linear classifier, $Z''_i$ is the vector information of the i-th word in the second vector matrix, and $\tilde{Z}''_i$ is the vector information of $Z''_i$ after the Bi-affine operation.
Substituting $Z''_i$, $W_{par}$ and $b_{par}$ into the above formula yields the probability of the i-th word bearing each grammatical relation; the grammatical relation with the highest probability is selected as the grammatical relation of the i-th word.
In step S302, the parent node in the dependency path of each word in the second vector matrix is computed as:
$\hat{y}_i^{head} = \mathrm{softmax}(W_{GR} \tilde{Z}'''_i + b_{GR})$
where $\hat{y}_i^{head}$ is the predicted parent node of the i-th word, its components give the probability that each candidate word is the parent node of the i-th word, $W_{GR}$ and $b_{GR}$ are the parameters of a linear classifier, $Z'''_i$ is the vector information of the i-th word in the third vector matrix, and $\tilde{Z}'''_i$ is the vector information of $Z'''_i$ after the Bi-affine operation.
Substituting $Z'''_i$, $\tilde{Z}'''_i$, $b_{GR}$ and $W_{GR}$ into the above formula yields the probability of each parent node in the dependency path of the i-th word; the parent node with the highest probability is selected as the parent node of the i-th word. Finally, the third vector matrix containing sentence structure information and inter-word dependencies is obtained as the final output of the transformer model.
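The two classifiers above can be pictured with the following sketch of biaffine arc scoring. The exact form of the Bi-affine operation is not spelled out in the application; the bilinear-plus-linear form below is the standard biaffine-attention formulation and should be read as an assumption.

```python
import numpy as np
from scipy.special import softmax

def biaffine_arc_scores(H, U, w_dep, w_head, b):
    """H: (sent_len, d) word vectors; entry (i, j) of the result scores
    word j as the parent of word i (one bilinear plus two linear terms)."""
    return H @ U @ H.T + (H @ w_dep)[:, None] + (H @ w_head)[None, :] + b

def predict_parents(H, U, w_dep, w_head, b):
    probs = softmax(biaffine_arc_scores(H, U, w_dep, w_head, b), axis=-1)
    return probs.argmax(axis=-1)     # highest-probability parent per word
```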
Each layer of the transformer model is composed of multi-head self-attention (a self-attention mechanism); when self-attention learns the weights between words, the weights are independent of the distance between the words, so every word can obtain the global information of the sentence.
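A single attention head can be sketched as follows; the resulting weight matrix is computed from word content alone, with no distance term, which is exactly the property the paragraph above relies on.

```python
import numpy as np
from scipy.special import softmax

def self_attention(H, Wq, Wk, Wv):
    """One attention head over word vectors H: (sent_len, d)."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)  # content-only weights
    return A @ V   # every word aggregates information from the whole sentence
```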
In an embodiment, as shown in FIG. 4, the step S104 includes:
S401: concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix;
S402: perform a convolution operation on the concatenated vectors and feed the result into a max-pooling layer for pooling;
S403: feed the pooled vector into a softmax function and output the relation prediction value of the sentence.
In this embodiment, each word vector in the third vector matrix is concatenated with the corresponding word position vector in the original vector matrix according to the following formulas:
$v_i = z_i \oplus p_i$
$V = [v_1, v_2, \ldots, v_{len}]$
where $v_i$ is the concatenation of the i-th word vector and its corresponding word position vector, $z_i$ is the word vector of the i-th word in the third vector matrix, $p_i$ denotes the word position vector of the i-th word in the original vector matrix, $\oplus$ denotes the concatenation of two vectors, $V$ is the concatenated matrix of the whole sentence, and len is the length of the sentence.
The concatenated vectors add the spatial position information of the sentence, and CNN convolution is performed on them. The CNN convolution operation uses a convolution kernel to fuse the information within a sliding window, i.e., only the information of adjacent words is aggregated; combining local and global information in this way allows the relations between entities to be extracted better. After the convolution operation, the result is fed into a max-pooling layer for pooling, the pooled vector is fed into a softmax function, and the relation prediction value of the sentence is output.
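The sketch below assembles the whole relation head. The window size k = 3, the ReLU non-linearity and the single convolution layer are assumptions; the application only states that a kernel fuses adjacent words, that max pooling follows, and that a softmax produces the relation prediction value.

```python
import numpy as np
from scipy.special import softmax

def relation_prediction(Z3, P, W_conv, b_conv, W_out, b_out, k=3):
    """Z3: (sent_len, d) third vector matrix; P: (sent_len, p) position vectors.
    W_conv: (k*(d+p), n_filters); W_out: (n_filters, n_relations)."""
    V = np.concatenate([Z3, P], axis=-1)               # v_i = z_i (+) p_i
    windows = np.stack([V[i:i + k].ravel()             # width-k sliding window
                        for i in range(len(V) - k + 1)])
    feats = np.maximum(windows @ W_conv + b_conv, 0)   # conv over adjacent words
    pooled = feats.max(axis=0)                         # max pooling over windows
    return softmax(pooled @ W_out + b_out)             # relation prediction value
```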
An embodiment of the present application further provides a semantically enhanced relation extraction apparatus, which is configured to perform any embodiment of the aforementioned semantically enhanced relation extraction method. Specifically, please refer to FIG. 5, which is a schematic block diagram of a semantically enhanced relation extraction apparatus provided by an embodiment of the present application.
As shown in FIG. 5, the semantically enhanced relation extraction apparatus 500 includes: a part-of-speech classification unit 501, an entity classification unit 502, a dependency parsing unit 503 and a convolution unit 504.
The part-of-speech classification unit 501 is configured to input the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information;
the entity classification unit 502 is configured to input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information;
the dependency parsing unit 503 is configured to input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies;
the convolution unit 504 is configured to concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors, and output the relation prediction value of the sentence.
The apparatus adds different learning tasks at different stages of the transformer model, so that the output of the transformer model embodies part-of-speech information, entity category information, sentence structure information and inter-word dependencies, and therefore achieves better results on relation extraction tasks.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
The above semantically enhanced relation extraction apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 6.
Please refer to FIG. 6, which is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 600 is a server; the server may be an independent server or a server cluster composed of multiple servers.
Referring to FIG. 6, the computer device 600 includes a processor 602, a memory and a network interface 605 connected through a system bus 601, where the memory may include a non-volatile storage medium 603 and an internal memory 604.
The non-volatile storage medium 603 can store an operating system 6031 and a computer program 6032. When the computer program 6032 is executed, it can cause the processor 602 to perform the semantically enhanced relation extraction method.
The processor 602 is configured to provide computing and control capabilities to support the operation of the entire computer device 600.
The internal memory 604 provides an environment for running the computer program 6032 in the non-volatile storage medium 603; when the computer program 6032 is executed by the processor 602, it can cause the processor 602 to perform the semantically enhanced relation extraction method.
The network interface 605 is used for network communication, such as transmitting data information. A person skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer device 600 to which the solution of the present application is applied; a specific computer device 600 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A person skilled in the art can understand that the embodiment of the computer device shown in FIG. 6 does not limit the specific composition of the computer device; in other embodiments, the computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components. For example, in some embodiments, the computer device may include only a memory and a processor; in such embodiments, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 6 and are not repeated here.
It should be understood that, in the embodiments of the present application, the processor 602 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Another embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the semantically enhanced relation extraction method of the embodiments of the present application.
The storage medium is a physical, non-transitory storage medium, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disc, or any other physical storage medium that can store program code.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any person familiar with the technical field can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and such modifications or replacements shall all fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (10)

  1. A semantically enhanced relation extraction method, comprising:
    inputting the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and outputting a first vector matrix containing part-of-speech information;
    inputting the first vector matrix into the intermediate layer of the transformer model for entity classification, and outputting a second vector matrix containing entity category information;
    inputting the second vector matrix into the top layer of the transformer model for dependency parsing, and outputting a third vector matrix containing sentence structure information and inter-word dependencies;
    concatenating each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolving the concatenated vectors, and outputting the relation prediction value of the sentence.
  2. The semantically enhanced relation extraction method according to claim 1, wherein inputting the original vector matrix of the sentence into the bottom layer of the transformer model for part-of-speech classification and outputting the first vector matrix containing part-of-speech information comprises:
    segmenting the sentence according to its original vector matrix to obtain the part-of-speech vector information of each word;
    performing part-of-speech classification on each word to obtain the part-of-speech category information of each word;
    obtaining the first vector matrix containing part-of-speech information according to the part-of-speech vector information and the part-of-speech category information of each word.
  3. The semantically enhanced relation extraction method according to claim 2, wherein performing part-of-speech classification on each word to obtain the part-of-speech category information of each word comprises:
    predicting the part-of-speech category of each word by the following formula:
    $\hat{y}_i^{pos} = \mathrm{softmax}(W_{pos} Z_i + b_{pos})$
    wherein $\hat{y}_i^{pos}$ is the i-th part-of-speech category, its components give the probability that the i-th word belongs to the i-th part-of-speech category, $W_{pos}$ and $b_{pos}$ are the parameters of a linear classifier, and $Z_i$ is the part-of-speech vector information of the i-th word in the original vector matrix.
  4. The semantically enhanced relation extraction method according to claim 1, wherein inputting the first vector matrix into the intermediate layer of the transformer model for entity classification and outputting the second vector matrix containing entity category information comprises:
    performing entity extraction on the sentence according to the first vector matrix to obtain a plurality of entity pairs;
    calculating the entity category of each entity pair by the following formula:
    $\hat{y}_i^{ent} = \mathrm{softmax}(W_{ent} Z'_i + b_{ent})$
    wherein $\hat{y}_i^{ent}$ is the i-th entity category, its components give the probability that the i-th entity pair belongs to the i-th entity category, $W_{ent}$ and $b_{ent}$ are the parameters of a linear classifier, and $Z'_i$ is the vector information of the i-th entity pair in the first vector matrix;
    obtaining the second vector matrix containing entity category information according to the entity categories of the entity pairs.
  5. The semantically enhanced relation extraction method according to claim 1, wherein inputting the second vector matrix into the top layer of the transformer model for dependency parsing and outputting the third vector matrix containing sentence structure information and inter-word dependencies comprises:
    parsing the sentence structure according to the second vector matrix to obtain the grammatical relation corresponding to each word;
    parsing the dependencies between words according to the second vector matrix to obtain the minimum dependency paths between words;
    obtaining the third vector matrix containing sentence structure information and inter-word dependencies according to the grammatical relation of each word and the minimum dependency paths between words.
  6. The semantically enhanced relation extraction method according to claim 5, wherein parsing the sentence structure according to the second vector matrix to obtain the grammatical relation corresponding to each word comprises:
    calculating the grammatical relation of each word in the second vector matrix by the following formula:
    $\hat{y}_i^{gr} = \mathrm{softmax}(W_{par} \tilde{Z}''_i + b_{par})$
    wherein $\hat{y}_i^{gr}$ is the i-th grammatical relation, its components give the probability that the i-th word bears the i-th grammatical relation, $W_{par}$ and $b_{par}$ are the parameters of a linear classifier, $Z''_i$ is the vector information of the i-th word in the second vector matrix, and $\tilde{Z}''_i$ is the vector information of $Z''_i$ after the Bi-affine operation;
    calculating the parent node in the dependency path of each word in the second vector matrix by the following formula:
    $\hat{y}_i^{head} = \mathrm{softmax}(W_{GR} \tilde{Z}'''_i + b_{GR})$
    wherein $\hat{y}_i^{head}$ is the i-th parent node, its components give the probability that the i-th word has the i-th parent node, $b_{GR}$ and $W_{GR}$ are the parameters of a linear classifier, $Z'''_i$ is the vector information of the i-th word in the third vector matrix, and $\tilde{Z}'''_i$ is the vector information of $Z'''_i$ after the Bi-affine operation.
  7. The semantically enhanced relation extraction method according to claim 1, wherein concatenating each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolving the concatenated vectors and outputting the relation prediction value of the sentence comprises:
    concatenating each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix;
    performing a convolution operation on the concatenated vectors and feeding the result into a max-pooling layer for pooling;
    feeding the pooled vector into a softmax function and outputting the relation prediction value of the sentence.
  8. A semantically enhanced relation extraction apparatus, comprising:
    a part-of-speech classification unit, configured to input the original vector matrix of a sentence into the bottom layer of a transformer model for part-of-speech classification, and output a first vector matrix containing part-of-speech information;
    an entity classification unit, configured to input the first vector matrix into the intermediate layer of the transformer model for entity classification, and output a second vector matrix containing entity category information;
    a dependency parsing unit, configured to input the second vector matrix into the top layer of the transformer model for dependency parsing, and output a third vector matrix containing sentence structure information and inter-word dependencies;
    a convolution unit, configured to concatenate each word vector in the third vector matrix with the corresponding word position vector in the original vector matrix, convolve the concatenated vectors, and output the relation prediction value of the sentence.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein when the processor executes the computer program, the semantically enhanced relation extraction method according to any one of claims 1 to 7 is implemented.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the semantically enhanced relation extraction method according to any one of claims 1 to 7.
PCT/CN2021/124642 2021-10-12 2021-10-19 Semantically enhanced relation extraction method, apparatus, computer device and storage medium WO2023060633A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111188258.5A CN113626608B (zh) 2021-10-12 2021-10-12 Semantically enhanced relation extraction method, apparatus, computer device and storage medium
CN202111188258.5 2021-10-12

Publications (1)

Publication Number Publication Date
WO2023060633A1 (zh)

Family

ID=78391160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124642 WO2023060633A1 (zh) 2021-10-12 2021-10-19 Semantically enhanced relation extraction method, apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113626608B (zh)
WO (1) WO2023060633A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405326A (zh) * 2023-06-07 2023-07-07 厦门瞳景智能科技有限公司 Blockchain-based information security management method and system
CN117521656A (zh) * 2023-11-30 2024-02-06 成都信息工程大学 End-to-end joint Chinese entity-relation extraction method for Chinese text

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027309A (zh) * 2019-12-05 2020-04-17 电子科技大学广东电子信息工程研究院 Method for extracting entity attribute values based on a bidirectional long short-term memory network
CN112989796A (zh) * 2021-03-10 2021-06-18 北京大学 Syntax-guided method for recognizing named-entity information in text
CN113239186A (zh) * 2021-02-26 2021-08-10 中国科学院电子学研究所苏州研究院 Graph convolutional network relation extraction method based on a multi-dependency representation mechanism
WO2021164226A1 (zh) * 2020-02-20 2021-08-26 平安科技(深圳)有限公司 Legal case knowledge graph query method, apparatus, device and storage medium
WO2021174774A1 (zh) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relation extraction method, computer device and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2281331A1 (en) * 1999-09-03 2001-03-03 Cognos Incorporated Database management system
CN111222317B (zh) * 2019-10-16 2022-04-29 平安科技(深圳)有限公司 Sequence labeling method, system and computer device
CN111581361B (zh) * 2020-04-22 2023-09-15 腾讯科技(深圳)有限公司 Intent recognition method and apparatus
CN112084793B (zh) * 2020-09-14 2024-05-14 深圳前海微众银行股份有限公司 Semantic recognition method, device and readable storage medium based on dependency syntax
CN113221539B (zh) * 2021-07-08 2021-09-24 华东交通大学 Nested named entity recognition method and system integrating syntactic information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027309A (zh) * 2019-12-05 2020-04-17 电子科技大学广东电子信息工程研究院 Method for extracting entity attribute values based on a bidirectional long short-term memory network
WO2021164226A1 (zh) * 2020-02-20 2021-08-26 平安科技(深圳)有限公司 Legal case knowledge graph query method, apparatus, device and storage medium
WO2021174774A1 (zh) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relation extraction method, computer device and readable storage medium
CN113239186A (zh) * 2021-02-26 2021-08-10 中国科学院电子学研究所苏州研究院 Graph convolutional network relation extraction method based on a multi-dependency representation mechanism
CN112989796A (zh) * 2021-03-10 2021-06-18 北京大学 Syntax-guided method for recognizing named-entity information in text

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405326A (zh) * 2023-06-07 2023-07-07 厦门瞳景智能科技有限公司 Blockchain-based information security management method and system
CN116405326B (zh) * 2023-06-07 2023-10-20 厦门瞳景智能科技有限公司 Blockchain-based information security management method and system
CN117521656A (zh) * 2023-11-30 2024-02-06 成都信息工程大学 End-to-end joint Chinese entity-relation extraction method for Chinese text
CN117521656B (zh) * 2023-11-30 2024-05-07 成都信息工程大学 End-to-end joint Chinese entity-relation extraction method for Chinese text

Also Published As

Publication number Publication date
CN113626608B (zh) 2022-02-15
CN113626608A (zh) 2021-11-09

Similar Documents

Publication Publication Date Title
CN108363790B (zh) Method, apparatus, device and storage medium for evaluating comments
US20220050967A1 (en) Extracting definitions from documents utilizing definition-labeling-dependent machine learning background
US10061766B2 (en) Systems and methods for domain-specific machine-interpretation of input data
US20220171936A1 (en) Analysis of natural language text in document
CN108875059B (zh) Method, apparatus, electronic device and storage medium for generating document tags
WO2023060633A1 (zh) Semantically enhanced relation extraction method, apparatus, computer device and storage medium
US9760626B2 (en) Optimizing parsing outcomes of documents
US20240111956A1 (en) Nested named entity recognition method based on part-of-speech awareness, device and storage medium therefor
CN110162771A (zh) Method, apparatus and electronic device for recognizing event trigger words
CN117076653A (zh) Knowledge-base question answering method improving in-context learning via chain-of-thought and visualization
CN111061876B (zh) Event public-opinion data analysis method and apparatus
WO2022116444A1 (zh) Text classification method and apparatus, computer device, and medium
WO2023103914A1 (zh) Text sentiment analysis method, apparatus and computer-readable storage medium
WO2023093909A1 (zh) Workflow node recommendation method and apparatus
CN116561320A (zh) Method, apparatus, device and medium for classifying automobile reviews
CN116414988A (zh) Dependency-enhanced graph convolution aspect-level sentiment classification method and system
CN110377753A (zh) Relation extraction method and apparatus based on relation trigger words and a GRU model
CN112529743B (zh) Contract element extraction method, apparatus, electronic device and medium
CN114490946A (zh) Similar-case retrieval method, system and device based on the XLNet model
Yu et al. Information Security Field Event Detection Technology Based on SAtt‐LSTM
WO2021056740A1 (zh) Language model construction method and system, computer device and readable storage medium
US9251135B2 (en) Correcting N-gram probabilities by page view information
Jiang et al. Automatic adaptation of annotations
US10169074B2 (en) Model driven optimization of annotator execution in question answering system
CN113971216B (zh) Data processing method and apparatus, electronic device, and memory

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21960362

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE