CN116362252A - Entity relationship identification method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN116362252A
CN116362252A (application CN202310405543.0A)
Authority
CN
China
Prior art keywords
entity
tail
vector
head
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310405543.0A
Other languages
Chinese (zh)
Inventor
陈焕坤
王伟
曾志贤
张黔
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Resources Digital Technology Co Ltd
Original Assignee
China Resources Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Resources Digital Technology Co Ltd filed Critical China Resources Digital Technology Co Ltd
Priority to CN202310405543.0A
Publication of CN116362252A

Classifications

    • G06F 40/295 Named entity recognition
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/3346 Query execution using probabilistic model
    • G06F 16/35 Clustering; Classification (information retrieval of unstructured textual data)
    • G06F 40/216 Parsing using statistical methods
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N 3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the application provide an entity relationship identification method and apparatus, an electronic device, and a storage medium, belonging to the technical field of artificial intelligence. The method comprises: obtaining a sample recognition text; obtaining a sample text vector according to the encoder; performing feature extraction on a head entity mark according to the decoder to obtain a head mark vector; performing entity extraction on the sample text vector according to the entity extraction layer to obtain a predicted head entity; obtaining a tail mark vector according to the decoder; performing entity extraction on the sample text vector according to the entity extraction layer to obtain a predicted tail entity; performing entity relationship extraction on the predicted head entity and the predicted tail entity according to the relationship prediction layer to obtain a predicted entity relationship; adjusting the original entity relationship recognition model to obtain a target entity relationship recognition model; and performing entity relationship recognition on a target recognition text according to the target entity relationship recognition model. The embodiments of the application can reduce the search space and increase the decoding speed, thereby speeding up the determination of entity relationships.

Description

Entity relationship identification method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for identifying entity relationships, an electronic device, and a storage medium.
Background
Information extraction techniques in natural language processing (NLP) include tasks such as entity extraction and relation extraction. The entity extraction task identifies entities with specific meanings in a text, such as dates and place names, and the relation extraction task identifies the relationship between two entities. In the related art, relation extraction methods fall into two types, extractive and generative, depending on the model used. Generative methods must iteratively generate the target text during decoding, so decoding is slow; moreover, their search space is the size of the vocabulary, and such an oversized search space easily causes exposure bias. Therefore, how to provide a method that reduces the search space in the decoding stage and increases the decoding speed, so as to speed up the determination of entity relationships, is a technical problem to be solved.
Disclosure of Invention
The embodiments of the application mainly aim to provide an entity relationship identification method and apparatus, an electronic device, and a storage medium, with the goal of reducing the search space and increasing the decoding speed, thereby speeding up the determination of entity relationships.
To achieve the above object, a first aspect of an embodiment of the present application provides an entity relationship identifying method, where the method includes:
acquiring a sample recognition text, wherein the sample recognition text comprises a sample head entity, a sample tail entity and a sample entity relationship between the sample head entity and the sample tail entity;
inputting the sample recognition text into a preset original entity relationship recognition model, wherein the original entity relationship recognition model comprises an encoder, a decoder, a relationship prediction layer and an entity extraction layer;
extracting features of the sample recognition text according to the encoder to obtain a sample text vector;
extracting features of a preset head entity mark according to the decoder to obtain a head mark vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the head mark vector to obtain a predicted head entity;
performing feature extraction on a preset tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a tail mark vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity;
extracting entity relations from the prediction head entity and the prediction tail entity according to the relation prediction layer to obtain a predicted entity relation;
performing parameter adjustment on the original entity relationship recognition model according to the sample head entity, the sample entity relationship, the predicted head entity and the predicted entity relationship to obtain a target entity relationship recognition model;
and carrying out entity relationship recognition on the obtained target recognition text according to the target entity relationship recognition model.
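The patent describes this pipeline only at the level of named layers. The following is a minimal numpy sketch of how steps S103 to S108 could fit together; the hidden size, random weights, mean-pooling decoder, and argmax-based extraction are all assumptions made purely for illustration and are not specified by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8        # hidden size (assumed)
SEQ = 10     # length of the sample recognition text
N_REL = 3    # number of relation types (assumed)

def encode(text_ids):
    # Encoder (step S103): map each character to a D-dimensional vector.
    emb = rng.standard_normal((100, D))
    return emb[text_ids]                        # sample text vector, (SEQ, D)

def decode(mark_vecs):
    # Decoder (steps S104 and S106): summarise the mark sequence seen so
    # far into one query vector; mean pooling stands in for real decoding.
    return np.mean(mark_vecs, axis=0)           # (D,)

def extract_entity(text_vec, query):
    # Entity extraction layer (steps S105 and S107): score every position
    # of the sample text vector against the query and take the best match.
    return int(np.argmax(text_vec @ query))

def predict_relation(head_vec, tail_vec, w_rel):
    # Relation prediction layer (step S108): classify the entity pair.
    return int(np.argmax(w_rel @ np.concatenate([head_vec, tail_vec])))

text_ids = rng.integers(0, 100, SEQ)
sample_text_vec = encode(text_ids)
head_mark = rng.standard_normal((1, D))         # preset head entity mark
head_pos = extract_entity(sample_text_vec, decode(head_mark))
tail_mark_seq = np.vstack([head_mark, sample_text_vec[head_pos:head_pos + 1]])
tail_pos = extract_entity(sample_text_vec, decode(tail_mark_seq))
w_rel = rng.standard_normal((N_REL, 2 * D))
rel = predict_relation(sample_text_vec[head_pos], sample_text_vec[tail_pos], w_rel)
```

Note how the decoder's query is built only from mark vectors and already-extracted entities, so the search space is the text length rather than the vocabulary size.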
In some embodiments, the sample text vector includes an entity first position vector and an entity last position vector;
the performing feature extraction on the sample recognition text according to the encoder to obtain a sample text vector includes:
extracting features of the sample recognition text according to the encoder to obtain a preliminary text vector;
extracting entity first position features of the preliminary text vector according to a preset first full-connection layer to obtain the entity first position vector;
and extracting the entity tail position features of the preliminary text vector according to a preset second full-connection layer to obtain the entity tail position vector.
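As a rough sketch of the two fully-connected layers described above, with arbitrary dimensions and random weights standing in for values the patent does not fix:

```python
import numpy as np

rng = np.random.default_rng(1)
SEQ, D = 6, 4   # text length and hidden size (assumed)

# Preliminary text vector from the encoder (step S201), randomised here.
prelim = rng.standard_normal((SEQ, D))

# Two separate fully-connected layers: the first projects each character
# vector toward entity first-position features (step S202), the second
# toward entity tail-position features (step S203).
w_first, b_first = rng.standard_normal((D, D)), np.zeros(D)
w_tail, b_tail = rng.standard_normal((D, D)), np.zeros(D)

entity_first_pos_vec = prelim @ w_first + b_first   # entity first position vector
entity_tail_pos_vec = prelim @ w_tail + b_tail      # entity tail position vector
```

Using two independent projections lets span starts and span ends be scored with different features while sharing the same encoder output.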
In some embodiments, the performing entity extraction on the sample text vector according to the entity extraction layer and the head mark vector to obtain a predicted head entity includes:
performing attention calculation on the entity head position vector according to the head mark vector to obtain a first predicted head position vector;
performing attention calculation on the entity tail position vector according to the head mark vector to obtain a first predicted tail position vector;
and carrying out entity extraction on the preliminary text vector according to the entity extraction layer, the first prediction head position vector and the first prediction tail position vector to obtain the prediction head entity.
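The attention calculation in the steps above could look like the following sketch. Scaled dot-product attention is an assumed concrete form; the patent does not specify the attention variant or the sizes used.

```python
import numpy as np

rng = np.random.default_rng(2)
SEQ, D = 5, 4   # assumed sizes

head_mark_vec = rng.standard_normal(D)            # from the decoder (step S104)
entity_first_pos_vec = rng.standard_normal((SEQ, D))
entity_tail_pos_vec = rng.standard_normal((SEQ, D))

def attend(query, keys):
    # Scaled dot-product attention of the mark vector over the per-position
    # vectors; the softmax weights give one score per character of the text.
    scores = keys @ query / np.sqrt(keys.shape[-1])
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

first_pred = attend(head_mark_vec, entity_first_pos_vec)  # head-position scores
tail_pred = attend(head_mark_vec, entity_tail_pos_vec)    # tail-position scores

# The entity extraction layer can then read off a span from the two score
# vectors, e.g. by taking the argmax of each.
span = (int(np.argmax(first_pred)), int(np.argmax(tail_pred)))
```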
In some embodiments, the entity extraction of the sample text vector according to the entity extraction layer and the tail flag vector to obtain a predicted tail entity includes:
performing attention calculation on the entity head position vector according to the tail mark vector to obtain a second prediction head position vector;
performing attention calculation on the entity tail position vector according to the tail mark vector to obtain a second predicted tail position vector;
and carrying out entity extraction on the preliminary text vector according to the entity extraction layer, the second prediction head position vector and the second prediction tail position vector to obtain the prediction tail entity.
In some embodiments, the extracting the entity relationship of the prediction head entity and the prediction tail entity according to the relationship prediction layer to obtain a predicted entity relationship includes:
performing feature extraction on a preset entity relation mark, the predicted tail entity, the tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a relation mark vector;
and extracting the entity relationship from the relationship marking vector according to the relationship prediction layer to obtain the predicted entity relationship.
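A sketch of the relation prediction layer, assumed here to be a linear map plus softmax over the relation types; the patent does not fix the layer's internal form, and the label set below is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
D, N_REL = 4, 3
RELATIONS = ["lyricist", "composer", "singer"]   # illustrative label set

# Relation mark vector produced by the decoder from the concatenated input
# (entity relation mark, predicted tail entity, tail entity mark,
# predicted head entity, head entity mark); randomised here.
rel_mark_vec = rng.standard_normal(D)

# Relation prediction layer: linear projection followed by softmax.
w, b = rng.standard_normal((N_REL, D)), np.zeros(N_REL)
logits = w @ rel_mark_vec + b
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()
predicted_relation = RELATIONS[int(np.argmax(probs))]
```

Because the classifier ranges over a fixed relation inventory rather than the whole vocabulary, this step keeps the decoding search space small.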
In some embodiments, the performing parameter adjustment on the original entity relationship recognition model according to the sample head entity, the sample entity relationship, the predicted head entity, and the predicted entity relationship to obtain a target entity relationship recognition model includes:
performing head position loss calculation according to the first predicted head position vector and a preset sample head position vector to obtain a head position loss;
performing tail position loss calculation according to the first predicted tail position vector and a preset sample tail position vector to obtain tail position loss;
calculating entity relation loss according to the sample entity relation and the predicted entity relation to obtain relation loss;
and carrying out parameter adjustment on the original entity relationship recognition model according to the head position loss, the tail position loss and the relationship loss to obtain the target entity relationship recognition model.
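A sketch of the three-part objective, assuming binary cross-entropy for the two position losses and categorical cross-entropy for the relation loss; the patent names the losses but not their functional form, and the equal weighting of the three terms is also an assumption.

```python
import numpy as np

def bce(pred, gold, eps=1e-9):
    # Binary cross-entropy over per-position probabilities, a common choice
    # for span-pointer losses.
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(gold * np.log(pred) + (1 - gold) * np.log(1 - pred)))

# Toy per-position probabilities against one-hot gold positions.
first_pred = np.array([0.1, 0.8, 0.1]); first_gold = np.array([0.0, 1.0, 0.0])
tail_pred = np.array([0.2, 0.1, 0.7]);  tail_gold = np.array([0.0, 0.0, 1.0])
rel_pred = np.array([0.7, 0.2, 0.1]);   rel_gold = np.array([1.0, 0.0, 0.0])

head_position_loss = bce(first_pred, first_gold)
tail_position_loss = bce(tail_pred, tail_gold)
relation_loss = float(-np.sum(rel_gold * np.log(np.clip(rel_pred, 1e-9, 1.0))))

# Joint objective used for the parameter adjustment.
total_loss = head_position_loss + tail_position_loss + relation_loss
```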
In some embodiments, before the entity relationship recognition is performed on the obtained target recognition text according to the target entity relationship recognition model, the method further includes:
splicing the head entity mark, the prediction head entity, the tail entity mark, the prediction tail entity, the entity relation mark and the prediction entity relation to obtain initial relation data;
if the initial relation data is inconsistent with the preset relation data, circularly executing preset relation identification operation until the updated initial relation data is consistent with the preset relation data;
wherein the relationship identification operation includes:
splicing the head entity mark with the initial relation data to obtain preliminary prediction data;
extracting features of the preliminary prediction data according to the decoder to obtain a preliminary head vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary head vector to obtain a preliminary head entity;
performing feature extraction on the tail entity mark, the preliminary head entity and the preliminary prediction data according to the decoder to obtain a preliminary tail vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary tail vector to obtain a preliminary tail entity;
performing feature extraction on the entity relation mark, the preliminary tail entity, the tail entity mark, the preliminary head entity and the preliminary prediction data according to the decoder to obtain a preliminary relation vector;
extracting entity relations from the preliminary relation vector according to the relation prediction layer to obtain a preliminary entity relation;
and updating the initial relation data according to the initial entity relation, the entity relation mark, the initial tail entity, the tail entity mark, the initial head entity and the head entity mark.
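The cyclic relation identification operation above amounts to repeatedly decoding one (head, tail, relation) triple and appending it to the running prediction data until a stopping condition is met. Schematically, with the decoder and both extraction layers abstracted into a single step function, and with the stopping condition and round limit as assumptions of this sketch:

```python
def relation_identification(decode_step, max_rounds=10):
    # One round decodes a triple from the current prediction data and
    # appends it; the loop stops when the step reports that the data is
    # consistent with the preset relation data (modelled here as the step
    # returning None). max_rounds guards against non-termination.
    data = []
    for _ in range(max_rounds):
        triple = decode_step(data)
        if triple is None:
            break
        data.extend(triple)
    return data

# A stand-in decode step that yields two triples and then stops.
triples = iter([("head1", "tail1", "rel1"), ("head2", "tail2", "rel2")])
def fake_step(_data):
    return next(triples, None)

result = relation_identification(fake_step)
```

Each round conditions on everything extracted so far, which is how overlapping triples sharing a head or tail entity can be recovered from one text.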
To achieve the above object, a second aspect of the embodiments of the present application proposes an entity relationship identifying apparatus, including:
the text acquisition module is used for acquiring a sample recognition text, wherein the sample recognition text comprises a sample head entity, a sample tail entity and a sample entity relationship between the sample head entity and the sample tail entity;
the data input module is used for inputting the sample recognition text into a preset original entity relationship recognition model, wherein the original entity relationship recognition model comprises an encoder, a decoder, a relationship prediction layer and an entity extraction layer;
the first feature extraction module is used for performing feature extraction on the sample recognition text according to the encoder to obtain a sample text vector;
the second feature extraction module is used for carrying out feature extraction on a preset head entity mark according to the decoder to obtain a head mark vector;
the first entity extraction module is used for carrying out entity extraction on the sample text vector according to the entity extraction layer and the header mark vector to obtain a prediction header entity;
the third feature extraction module is used for carrying out feature extraction on the preset tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a tail mark vector;
the second entity extraction module is used for carrying out entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity;
the entity relation extraction module is used for extracting entity relation of the prediction head entity and the prediction tail entity according to the relation prediction layer to obtain a predicted entity relation;
the parameter adjustment module is used for carrying out parameter adjustment on the original entity relationship identification model according to the sample head entity, the sample entity relationship, the prediction head entity and the prediction entity relationship to obtain a target entity relationship identification model;
And the entity relationship recognition module is used for carrying out entity relationship recognition on the obtained target recognition text according to the target entity relationship recognition model.
To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, which includes a memory and a processor, the memory storing a computer program, the processor implementing the method according to the first aspect when executing the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method of the first aspect.
According to the entity relationship identification method and apparatus, electronic device, and storage medium of the embodiments of the application, the predicted head entity in the sample text vector is determined through the head mark vector, and the predicted tail entity in the sample text vector is determined through the tail mark vector, so that head entity selection is performed on the sample text vector through the head mark vector and tail entity selection through the tail mark vector. This avoids the related-art approach of directly generating the head entity and the tail entity, reduces the size of the search space to a certain extent, and increases the decoding speed. Therefore, when entity relationship recognition is performed on the target recognition text according to the target entity relationship recognition model, the entity relationship can be determined faster.
Drawings
FIG. 1 is a flowchart of a method for identifying entity relationships provided in an embodiment of the present application;
fig. 2 is a flowchart of step S103 in fig. 1;
fig. 3 is a flowchart of step S105 in fig. 1;
fig. 4 is a flowchart of step S107 in fig. 1;
fig. 5 is a flowchart of step S108 in fig. 1;
fig. 6 is a flowchart of step S109 in fig. 1;
FIG. 7A is a flowchart of one embodiment of a method for entity relationship identification provided by embodiments of the present application;
FIG. 7B is a flowchart of a relationship identification operation in the entity relationship identification method according to the embodiment of the present application;
fig. 8 is a schematic structural diagram of an entity relationship identifying apparatus according to an embodiment of the present application;
fig. 9 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
First, several terms referred to in this application are explained:
Artificial intelligence (AI): a technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is also a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Natural language processing (NLP): a branch of artificial intelligence and an interdisciplinary field of computer science and linguistics, often referred to as computational linguistics, concerned with processing, understanding, and applying human languages (e.g., Chinese, English). Natural language processing includes parsing, semantic analysis, discourse understanding, and the like. It is commonly used in machine translation, recognition of handwritten and printed characters, speech recognition and text-to-speech conversion, intent recognition, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining, and involves data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, and linguistic research related to language computation.
Information extraction (Information Extraction): a text processing technology that extracts factual information of specified types, such as entities, relations, and events, from natural language text and outputs structured data. Text data is made up of specific units such as sentences, paragraphs, and chapters, and text information is made up of smaller specific units such as words, phrases, sentences, and paragraphs, or combinations of these units. Extracting noun phrases, person names, place names, and the like from text data are all forms of text information extraction, and the information extracted by this technology can be of various types.
Information extraction techniques in NLP include tasks such as entity extraction and relation extraction. The entity extraction task identifies entities with specific meanings in a text, such as dates and place names, and the relation extraction task identifies the relationship between two entities. In the related art, relation extraction methods fall into two types, extractive and generative, depending on the model used. Generative methods must iteratively generate the target text during decoding, so decoding is slow; moreover, their search space is the size of the vocabulary, and such an oversized search space easily causes exposure bias. Therefore, how to provide a method that reduces the search space in the decoding stage and increases the decoding speed, so as to speed up the determination of entity relationships, is a technical problem to be solved.
Based on this, the embodiment of the application provides a method and a device for identifying entity relationships, an electronic device and a storage medium, which aim to reduce the search space in the decoding stage and improve the decoding speed so as to improve the entity relationship determining speed.
The embodiments of the application provide an entity relationship identification method and apparatus, an electronic device, and a storage medium, which are described in detail through the following embodiments; the entity relationship identification method in the embodiments of the application is described first.
The embodiment of the application can acquire and process the related data based on the artificial intelligence technology. Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The embodiment of the application provides an entity relationship identification method, and relates to the technical field of artificial intelligence. The entity relation identification method provided by the embodiment of the application can be applied to the terminal, the server side and software running in the terminal or the server side. In some embodiments, the terminal may be a smart phone, tablet, notebook, desktop, etc.; the server side can be configured as an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and a cloud server for providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligent platforms and the like; the software may be an application or the like that implements the entity relationship identification method, but is not limited to the above form.
The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Fig. 1 is an optional flowchart of a method for identifying entity relationships according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S110.
step S101, acquiring a sample recognition text, wherein the sample recognition text comprises a sample head entity, a sample tail entity and a sample entity relationship between the sample head entity and the sample tail entity;
step S102, inputting a sample recognition text into a preset original entity relation recognition model, wherein the original entity relation recognition model comprises an encoder, a decoder, a relation prediction layer and an entity extraction layer;
step S103, extracting features of the sample recognition text according to the encoder to obtain a sample text vector;
step S104, extracting features of a preset head entity mark according to a decoder to obtain a head mark vector;
step S105, performing entity extraction on the sample text vector according to the entity extraction layer and the head mark vector to obtain a predicted head entity;
step S106, extracting features of a preset tail entity mark, a predicted head entity and a head entity mark according to a decoder to obtain a tail mark vector;
step S107, performing entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity;
step S108, extracting entity relations of the predicted head entity and the predicted tail entity according to the relation prediction layer to obtain predicted entity relations;
step S109, performing parameter adjustment on the original entity relationship recognition model according to the sample head entity, the sample entity relationship, the predicted head entity and the predicted entity relationship to obtain a target entity relationship recognition model;
step S110, entity relation recognition is carried out on the obtained target recognition text according to the target entity relation recognition model.
Through steps S101 to S110 of this embodiment, the predicted head entity in the sample text vector is determined by the head mark vector, and the predicted tail entity is determined by the tail mark vector, so that head entity selection and tail entity selection are performed on the sample text vector through the head mark vector and the tail mark vector respectively. This avoids the related-art approach of directly generating the head entity and the tail entity, reduces the size of the search space to a certain extent, and increases the decoding speed. Therefore, when entity relationship recognition is performed on the target recognition text according to the target entity relationship recognition model, the entity relationship can be determined faster.
In step S101 of some embodiments, the sample recognition text refers to the text to be recognized that is used as training data. The sample recognition text comprises a plurality of characters, and the sample head entity, the sample tail entity, and the sample entity relationship comprise different characters. When entity A and entity B have an entity relationship C, entity A is called the head entity and entity B is called the tail entity. Taking the sample recognition text "the song for which Xiaoming wrote the lyrics and which Dahong composed and sang" as an example, the sample recognition text comprises each character of that sentence, and includes the sample head entities, the sample tail entities, and the sample entity relationships from the sample head entities to the sample tail entities shown in Table 1 below.
Sample head entity | Sample tail entity | Sample entity relationship
Xiaoming | Song | Wrote the lyrics
Dahong | Song | Composed the music
Dahong | Song | Sang

TABLE 1
In step S102 of some embodiments, an original entity-relationship recognition model including an encoder, a decoder, a relationship prediction layer, and an entity extraction layer is preset, and a sample recognition text is used as input data of the original entity-relationship recognition model.
In step S103 of some embodiments, the sample recognition text is used as input data of the encoder, and feature extraction is performed on each character in the sample recognition text by the encoder to obtain the corresponding sample text vectors.
Referring to fig. 2, in some embodiments, the sample text vector includes an entity head position vector and an entity tail position vector. Step S103 includes, but is not limited to, including step S201 to step S203.
Step S201, performing feature extraction on the sample recognition text according to the encoder to obtain preliminary text vectors;
step S202, performing entity head position feature extraction on the preliminary text vectors according to a preset first fully-connected layer to obtain entity head position vectors;
step S203, performing entity tail position feature extraction on the preliminary text vectors according to a preset second fully-connected layer to obtain entity tail position vectors.
In step S201 of some embodiments, feature extraction is performed on the sample recognition text X = [x_1, x_2, ..., x_n] according to the encoder to obtain the set of preliminary text vectors H = [h_1, h_2, ..., h_n], where n denotes the sentence length of the sample recognition text, and h_i (i ≤ n) denotes the preliminary text vector corresponding to the i-th character x_i in the sample recognition text.
In step S202 of some embodiments, the first fully-connected layer is configured to map the preliminary text vectors extracted by the encoder to a high-dimensional space and mix their information, so as to extract entity head position vectors carrying entity head position information. It is understood that the entity head position refers to the position corresponding to the first character of an entity. Specifically, the entity head position vector h^L_i of the preliminary text vector h_i is calculated according to the following formula (1):

h^L_i = W_L h_i ... (1)

where W_L is the parameter matrix of the first fully-connected layer. From formula (1), the set of entity head position vectors H^L = [h^L_1, h^L_2, ..., h^L_n] corresponding to the set of preliminary text vectors H = [h_1, h_2, ..., h_n] can be obtained.
In step S203 of some embodiments, the second fully-connected layer is configured to map the preliminary text vectors extracted by the encoder to another high-dimensional space and mix their information, so as to extract entity tail position vectors carrying entity tail position information. It is understood that the entity tail position refers to the position corresponding to the tail character of an entity. Specifically, the entity tail position vector h^R_i of the preliminary text vector h_i is calculated according to the following formula (2):

h^R_i = W_R h_i ... (2)

where W_R is the parameter matrix of the second fully-connected layer. From formula (2), the set of entity tail position vectors H^R = [h^R_1, h^R_2, ..., h^R_n] corresponding to the set of preliminary text vectors H = [h_1, h_2, ..., h_n] can be obtained.
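As an illustrative sketch (not the patent's actual implementation), the two fully-connected layers of formulas (1) and (2) can be expressed in plain NumPy; the sentence length, hidden size, and random weights below are assumptions made only for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 5, 8                    # sentence length and hidden size (assumed)
H = rng.normal(size=(n, d))    # preliminary text vectors h_1..h_n from the encoder

# First fully-connected layer (parameter matrix W_L) extracts entity head
# position vectors; second (W_R) extracts entity tail position vectors.
W_L = rng.normal(size=(d, d))
W_R = rng.normal(size=(d, d))

H_L = H @ W_L.T                # formula (1): h^L_i = W_L h_i, for each i
H_R = H @ W_R.T                # formula (2): h^R_i = W_R h_i, for each i

print(H_L.shape, H_R.shape)
```

In a trained model, W_L and W_R would be learned parameters rather than random matrices; using two separate layers lets head and tail position information live in different subspaces.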
In step S104 of some embodiments, the head entity marker is a preset template marker for marking the position of the head entity. The head entity marker is used as input data of the decoder, and feature extraction is performed on the head entity marker by the decoder to obtain the head tag vector. Taking the head entity marker <HEAD> as an example, the input to the decoder is denoted input = [<HEAD>], and the head tag vector h_head output by the decoder is obtained.
In step S105 of some embodiments, the head tag vector and the sample text vector are used as input data of the entity extraction layer, and the entity extraction layer determines the distribution probability of the head tag vector over the sample text vector, so as to determine, according to the distribution probability, which vector is most likely the vector corresponding to the predicted head entity, thereby obtaining the predicted head entity.
Referring to fig. 3, in some embodiments, step S105 includes, but is not limited to including, step S301 through step S303.
Step S301, performing attention computation on the entity head position vectors according to the head tag vector to obtain a first predicted head position vector;
step S302, performing attention computation on the entity tail position vectors according to the head tag vector to obtain a first predicted tail position vector;
step S303, performing entity extraction on the preliminary text vectors according to the entity extraction layer, the first predicted head position vector, and the first predicted tail position vector to obtain the predicted head entity.
In step S301 of some embodiments, attention is computed over the entity head position vectors according to the head tag vector, so as to determine the distribution probability of the head tag vector over the entity head position vectors, and the entity head position vector corresponding to the maximum probability value is taken as the first predicted head position vector. That is, the similarity between each entity head position vector and the head tag vector is determined through attention computation, and the higher the similarity, the larger the corresponding probability value. The first predicted head position vector is obtained according to this similarity, and is therefore the position vector among the entity head position vectors that is most likely to correspond to the first character of the head entity. Specifically, with the head tag vector h_head as the query vector and the set of entity head position vectors H^L = [h^L_1, h^L_2, ..., h^L_n] as the key vectors (Key) and value vectors (Value), the distribution probability P_L is calculated according to the following formulas (3) to (7):

s1_i = h_head · h^L_i ... (3)
A = softmax([s1_1, s1_2, ..., s1_n]) = [a_1, a_2, ..., a_n] ... (4)
v1 = Σ_{i=1}^{n} a_i h^L_i ... (5)
z1_i = (h_head + v1) · h^L_i ... (6)
P_L = softmax([z1_1, z1_2, ..., z1_n]) = [P1_1, P1_2, ..., P1_n] ... (7)

An argmax operation is performed on the distribution probability P_L to determine the entity head position vector corresponding to the maximum probability value in P_L. For example, if the entity head position vector corresponding to the maximum probability value is h^L_1, the character corresponding to h^L_1 is most likely the first character of the predicted head entity.
In step S302 of some embodiments, attention is computed over the entity tail position vectors according to the head tag vector, so as to determine the distribution probability of the head tag vector over the entity tail position vectors, and the entity tail position vector corresponding to the maximum probability value is taken as the first predicted tail position vector. That is, the similarity between each entity tail position vector and the head tag vector is determined through attention computation, and the higher the similarity, the larger the corresponding probability value. The first predicted tail position vector is obtained according to this similarity, and is therefore the position vector among the entity tail position vectors that is most likely to correspond to the tail character of the head entity. Specifically, with the head tag vector h_head as the query vector and the set of entity tail position vectors H^R = [h^R_1, h^R_2, ..., h^R_n] as the key vectors (Key) and value vectors (Value), the distribution probability P_R is calculated according to the following formulas (8) to (12):

s2_i = h_head · h^R_i ... (8)
B = softmax([s2_1, s2_2, ..., s2_n]) = [b_1, b_2, ..., b_n] ... (9)
v2 = Σ_{i=1}^{n} b_i h^R_i ... (10)
z2_i = (h_head + v2) · h^R_i ... (11)
P_R = softmax([z2_1, z2_2, ..., z2_n]) = [P2_1, P2_2, ..., P2_n] ... (12)

An argmax operation is performed on the distribution probability P_R to determine the entity tail position vector corresponding to the maximum probability value in P_R. For example, if the entity tail position vector corresponding to the maximum probability value is h^R_2, the character corresponding to h^R_2 is most likely the tail character of the predicted head entity.
In step S303 of some embodiments, the first predicted head position vector and the first predicted tail position vector are used as input data of the entity extraction layer, so that the corresponding entity is selected from the preliminary text vectors according to the first predicted head position vector and the first predicted tail position vector, and the selected entity is used as the predicted head entity. Taking the sample recognition text "the song for which Xiaoming wrote the lyrics and which Dahong composed and sang" as an example, when the first predicted head position vector is h^L_1, it indicates that the character corresponding to the preliminary text vector h_1 (i.e., "Xiao") is the first character of the predicted head entity; when the first predicted tail position vector is h^R_2, it indicates that the character corresponding to the preliminary text vector h_2 (i.e., "Ming") is the tail character of the predicted head entity, so the predicted head entity "Xiaoming" can be obtained.
The advantage of steps S301 to S303 is that the character corresponding to the most similar vector is selected from the entity head position vectors according to the head tag vector as the first character of the predicted head entity, and the character corresponding to the most similar vector is selected from the entity tail position vectors according to the head tag vector as the tail character of the predicted head entity, so that the predicted head entity is determined according to the first character and the tail character. It can be seen that the method for determining the predicted head entity in the embodiment of the present application actually performs a selection among n entity head position vectors and n entity tail position vectors, so that the search space is reduced compared with the method of directly predicting the entity in the related art.
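The attention-based position selection of steps S301 and S302 can be sketched as follows. This is a minimal illustration with random vectors, not the patent's implementation; in particular, the re-scoring step that combines the query with the aggregated context is an assumed form, since the exact formulas are rendered as images in the source:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def select_position(query, keys):
    """Two-stage attention over n position vectors: score the keys against
    the query, aggregate a context vector from the values, re-score with
    the context-enriched query, and return a distribution over positions."""
    a = softmax(keys @ query)      # initial scores and softmax
    ctx = a @ keys                 # weighted aggregation of the values
    z = keys @ (query + ctx)       # re-scoring (assumed form)
    return softmax(z)

rng = np.random.default_rng(1)
n, d = 6, 4
H_L = rng.normal(size=(n, d))      # entity head position vectors
H_R = rng.normal(size=(n, d))      # entity tail position vectors
h_head = rng.normal(size=d)        # <HEAD> tag vector from the decoder

P_L = select_position(h_head, H_L)            # distribution over head positions
P_R = select_position(h_head, H_R)            # distribution over tail positions
start, end = int(P_L.argmax()), int(P_R.argmax())  # argmax picks the entity span
print(start, end)
```

The argmax over each n-way distribution is what reduces the search space to the sentence length: the model only ever chooses among the n positions of the input, never over the whole vocabulary.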
In step S106 of some embodiments, the tail entity marker is a preset template marker for marking the position of the tail entity. The head entity marker, the predicted head entity, and the tail entity marker are spliced in sequence, and the spliced data is used as input data of the decoder, so that feature extraction is performed on the input data by the decoder to obtain a tail tag vector carrying the predicted head entity information. Taking the predicted head entity "Xiaoming" and the tail entity marker <TAIL> as an example, the input to the decoder is denoted input = [<HEAD> Xiaoming <TAIL>], and the tail tag vector h_tail output by the decoder is obtained.
In step S107 of some embodiments, the tail tag vector and the sample text vector are used as input data of the entity extraction layer, and the entity extraction layer determines the distribution probability of the tail tag vector over the sample text vector, so as to determine, according to the distribution probability, which vector is most likely the vector of the predicted tail entity corresponding to the predicted head entity, thereby obtaining the predicted tail entity. As can be seen from steps S104 to S107, in the embodiment of the present application, a vector in the sample text vectors is selected according to the probability distribution corresponding to the head tag vector to obtain the head entity, and a vector in the sample text vectors is selected according to the probability distribution corresponding to the tail tag vector to obtain the tail entity. Therefore, the search space required for directly predicting entities in the related art can be reduced from the vocabulary size V to the sentence length n, where the sentence length refers to the length of the sample recognition text and n refers to the number of characters in the sample recognition text.
Referring to fig. 4, in some embodiments, step S107 includes, but is not limited to including, step S401 through step S403.
Step S401, performing attention computation on the entity head position vectors according to the tail tag vector to obtain a second predicted head position vector;
step S402, performing attention computation on the entity tail position vectors according to the tail tag vector to obtain a second predicted tail position vector;
step S403, performing entity extraction on the preliminary text vectors according to the entity extraction layer, the second predicted head position vector, and the second predicted tail position vector to obtain the predicted tail entity.
In step S401 of some embodiments, attention is computed over the entity head position vectors according to the tail tag vector, so as to determine the distribution probability of the tail tag vector over the entity head position vectors, and the entity head position vector corresponding to the maximum probability value is taken as the second predicted head position vector. That is, the similarity between each entity head position vector and the tail tag vector is determined through attention computation, and the higher the similarity, the larger the corresponding probability value. The second predicted head position vector is obtained according to this similarity, and is therefore the position vector among the entity head position vectors that is most likely to correspond to the first character of the tail entity. Specifically, with the tail tag vector h_tail as the query vector and the set of entity head position vectors H^L = [h^L_1, h^L_2, ..., h^L_n] as the key vectors (Key) and value vectors (Value), the distribution probability Q_L is calculated according to the following formulas (13) to (17):

s3_i = h_tail · h^L_i ... (13)
C = softmax([s3_1, s3_2, ..., s3_n]) = [c_1, c_2, ..., c_n] ... (14)
v3 = Σ_{i=1}^{n} c_i h^L_i ... (15)
z3_i = (h_tail + v3) · h^L_i ... (16)
Q_L = softmax([z3_1, z3_2, ..., z3_n]) = [Q1_1, Q1_2, ..., Q1_n] ... (17)

An argmax operation is performed on the distribution probability Q_L to determine the entity head position vector corresponding to the maximum probability value in Q_L. For example, if the entity head position vector corresponding to the maximum probability value is h^L_13, the character corresponding to h^L_13 is most likely the first character of the predicted tail entity.
In step S402 of some embodiments, attention is computed over the entity tail position vectors according to the tail tag vector, so as to determine the distribution probability of the tail tag vector over the entity tail position vectors, and the entity tail position vector corresponding to the maximum probability value is taken as the second predicted tail position vector. That is, the similarity between each entity tail position vector and the tail tag vector is determined through attention computation, and the higher the similarity, the larger the corresponding probability value. The second predicted tail position vector is obtained according to this similarity, and is therefore the position vector among the entity tail position vectors that is most likely to correspond to the tail character of the tail entity. Specifically, with the tail tag vector h_tail as the query vector and the set of entity tail position vectors H^R = [h^R_1, h^R_2, ..., h^R_n] as the key vectors (Key) and value vectors (Value), the distribution probability Q_R is calculated according to the following formulas (18) to (22):

s4_i = h_tail · h^R_i ... (18)
D = softmax([s4_1, s4_2, ..., s4_n]) = [d_1, d_2, ..., d_n] ... (19)
v4 = Σ_{i=1}^{n} d_i h^R_i ... (20)
z4_i = (h_tail + v4) · h^R_i ... (21)
Q_R = softmax([z4_1, z4_2, ..., z4_n]) = [Q2_1, Q2_2, ..., Q2_n] ... (22)

An argmax operation is performed on the distribution probability Q_R to determine the entity tail position vector corresponding to the maximum probability value in Q_R. For example, if the entity tail position vector corresponding to the maximum probability value is h^R_14, the character corresponding to h^R_14 is most likely the tail character of the predicted tail entity.
In step S403 of some embodiments, the second predicted head position vector and the second predicted tail position vector are used as input data of the entity extraction layer, so that the corresponding entity is selected from the preliminary text vectors according to the second predicted head position vector and the second predicted tail position vector, and the selected entity is used as the predicted tail entity. Taking the sample recognition text "the song for which Xiaoming wrote the lyrics and which Dahong composed and sang" as an example, when the second predicted head position vector is h^L_13, it indicates that the character corresponding to the preliminary text vector h_13 is the first character of the predicted tail entity; when the second predicted tail position vector is h^R_14, it indicates that the character corresponding to the preliminary text vector h_14 is the tail character of the predicted tail entity, so the predicted tail entity "song" can be obtained.
The advantage of steps S401 to S403 is that the character corresponding to the most similar vector is selected from the entity head position vectors according to the tail tag vector as the first character of the predicted tail entity, and the character corresponding to the most similar vector is selected from the entity tail position vectors according to the tail tag vector as the tail character of the predicted tail entity, so that the predicted tail entity is determined according to the first character and the tail character. It can be seen that the method for determining the predicted tail entity in the embodiment of the present application actually performs a selection among n entity head position vectors and n entity tail position vectors, so that the search space is reduced compared with the method of directly predicting the entity in the related art.
In step S108 of some embodiments, the entity relationship between the predicted head entity and the predicted tail entity is predicted by the relationship prediction layer to obtain the predicted entity relationship.
Referring to fig. 5, in some embodiments, step S108 includes, but is not limited to including, step S501 through step S502.
Step S501, performing feature extraction on a preset entity relationship marker, the predicted tail entity, the tail entity marker, the predicted head entity, and the head entity marker according to the decoder to obtain a relationship tag vector;
step S502, performing entity relationship extraction on the relationship tag vector according to the relationship prediction layer to obtain the predicted entity relationship.
In step S501 of some embodiments, the entity relationship marker is a preset template marker for marking entity relationships. The head entity marker, the predicted head entity, the tail entity marker, the predicted tail entity, and the entity relationship marker are spliced in sequence, the spliced data is used as input data of the decoder, and feature extraction is performed on the input data by the decoder to obtain a relationship tag vector carrying the predicted head entity information and the predicted tail entity information. Taking the predicted head entity "Xiaoming", the predicted tail entity "song", and the entity relationship marker <RELA> as an example, the input to the decoder is denoted input = [<HEAD> Xiaoming <TAIL> song <RELA>], and the relationship tag vector h_rela output by the decoder is obtained.
In step S502 of some embodiments, the relationship tag vector is used as input data of the relationship prediction layer, and the relationship between the entity pair is predicted by the relationship prediction layer to obtain the predicted entity relationship. Continuing the example in step S501, the entity pair refers to the predicted head entity information and the predicted tail entity information implied by the relationship tag vector, namely the predicted head entity "Xiaoming" and the predicted tail entity "song". Thus, the resulting predicted entity relationship is a relationship that characterizes "Xiaoming" and "song", e.g., the predicted entity relationship "wrote the lyrics". Specifically, the predicted entity relationship y_rela is calculated according to the following formula (23):

y_rela = softmax(W_c h_rela) ... (23)

where W_c is the parameter matrix of the relationship prediction layer.
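The relationship prediction layer of formula (23) is a single linear map followed by a softmax over the relation labels. A minimal sketch with random weights (the three-label relation set is an assumption for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
d, num_relations = 8, 3                    # hidden size and label count (assumed),
                                           # e.g. lyricist / composer / singer
W_c = rng.normal(size=(num_relations, d))  # parameter matrix of the relationship prediction layer
h_rela = rng.normal(size=d)                # relationship tag vector from the decoder

y_rela = softmax(W_c @ h_rela)             # formula (23): distribution over relation labels
predicted_relation = int(y_rela.argmax())
print(predicted_relation, y_rela.round(3))
```

Unlike the position selection steps, this classification is over the fixed relation label set, so its output size does not depend on the sentence length.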
In step S109 of some embodiments, an entity prediction loss is determined from the sample head entity and the predicted head entity, and an entity relationship prediction loss is determined from the predicted entity relationship and the sample entity relationship. Parameter adjustment is performed on the original entity relationship recognition model according to these two losses to obtain a target entity relationship recognition model with higher prediction accuracy.
Referring to fig. 6, in some embodiments, step S109 includes, but is not limited to including, step S601 through step S604.
Step S601, performing head position loss calculation according to the first predicted head position vector and a preset sample head position vector to obtain a head position loss;
step S602, performing tail position loss calculation according to the first predicted tail position vector and a preset sample tail position vector to obtain a tail position loss;
step S603, performing entity relationship loss calculation according to the sample entity relationship and the predicted entity relationship to obtain a relationship loss;
step S604, performing parameter adjustment on the original entity relationship recognition model according to the head position loss, the tail position loss, and the relationship loss to obtain the target entity relationship recognition model.
It should be noted that, as can be seen from steps S301 to S303 and steps S401 to S403, an entity (whether the predicted head entity or the predicted tail entity) is substantially determined by its entity head position (i.e., the position corresponding to its first character) and its entity tail position (i.e., the position corresponding to its tail character), so the entity prediction loss described in step S109 includes the head position loss and the tail position loss.
In step S601 of some embodiments, the sample head position vector is pre-labeled and indicates whether the entity head position vector h^L_i of the preliminary text vector h_i is the head position vector of the head entity. Still taking the head entity "Xiaoming" as an example, in the set of entity head position vectors H^L = [h^L_1, h^L_2, ..., h^L_n], the label corresponding to h^L_1 is 1 and the labels corresponding to the entity head position vectors of the remaining preliminary text vectors are all 0. Here, "1" indicates that the character corresponding to the preliminary text vector h_1 (i.e., "Xiao") is the character at the head position of the head entity, and "0" indicates that none of the characters corresponding to the remaining preliminary text vectors is the character at the head position of the head entity. It is understood that "0" and "1" are merely exemplary and may be represented in other ways. Therefore, the head position loss loss_L can be calculated according to the following formula (24):

loss_L = CR(P_L, y_L) ... (24)

where CR(·) denotes the cross-entropy loss function, y_L denotes the sample head position vector, and P_L is the distribution probability used to determine the first predicted head position vector.
In step S602 of some embodiments, the sample tail position vector is pre-labeled and indicates whether the entity tail position vector h^R_i of the preliminary text vector h_i is the tail position vector of the head entity. Still taking the head entity "Xiaoming" as an example, in the set of entity tail position vectors H^R = [h^R_1, h^R_2, ..., h^R_n], the label corresponding to h^R_2 is 1 and the labels corresponding to the entity tail position vectors of the remaining preliminary text vectors are all 0. Here, "1" indicates that the character corresponding to the preliminary text vector h_2 (i.e., "Ming") is the character at the tail position of the head entity, and "0" indicates that none of the characters corresponding to the remaining preliminary text vectors is the character at the tail position of the head entity. It is understood that "0" and "1" are merely exemplary and may be represented in other ways. Therefore, the tail position loss loss_R can be calculated according to the following formula (25):

loss_R = CR(P_R, y_R) ... (25)

where y_R denotes the sample tail position vector, and P_R is the distribution probability used to determine the first predicted tail position vector.
In step S603 of some embodiments, the relationship loss is calculated according to the following formula (26):

loss_multi = CR(y_rela, y') ... (26)

where y' denotes the sample entity relationship.
In step S604 of some embodiments, the head position loss, the tail position loss, and the relationship loss are summed, and the original entity relationship recognition model is subjected to parameter adjustment according to the calculated sum value, so as to obtain the target entity relationship recognition model.
It will be appreciated that the head position loss may also be calculated from the second predicted head position vector, and the tail position loss from the second predicted tail position vector. Alternatively, a head entity prediction loss may be determined from the predicted head entity and the sample head entity, a tail entity prediction loss from the predicted tail entity and the sample tail entity, and the parameters of the original entity relationship recognition model adjusted according to the head entity prediction loss, the tail entity prediction loss, and the relationship loss. It can be seen that the method for calculating the loss value used for parameter adjustment is not particularly limited in the embodiment of the present application. The benefit of steps S601 to S604 is that, since both the head entity and the tail entity are selected according to head and tail positions, the head position loss and the tail position loss can simultaneously represent the head entity prediction loss and the tail entity prediction loss, thereby reducing the amount of loss computation.
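The total training loss of steps S601 to S604 can be sketched with toy numbers; the distributions below are made up purely for illustration of formulas (24) to (26):

```python
import math

def cross_entropy(p, y):
    """CR(p, y): cross entropy between a predicted distribution p and a one-hot target y."""
    return -sum(yi * math.log(pi + 1e-12) for pi, yi in zip(p, y))

# Toy distributions for a 4-character sentence and 3 relation types.
P_L = [0.7, 0.1, 0.1, 0.1]; y_L = [1.0, 0.0, 0.0, 0.0]   # head position, formula (24)
P_R = [0.1, 0.6, 0.2, 0.1]; y_R = [0.0, 1.0, 0.0, 0.0]   # tail position, formula (25)
y_rela = [0.8, 0.1, 0.1];   y_true = [1.0, 0.0, 0.0]     # relation, formula (26)

loss_L = cross_entropy(P_L, y_L)
loss_R = cross_entropy(P_R, y_R)
loss_multi = cross_entropy(y_rela, y_true)

total = loss_L + loss_R + loss_multi   # summed for parameter adjustment (step S604)
print(round(total, 4))
```

Summing the three terms gives a single scalar to backpropagate, which is why the position losses can stand in for full entity prediction losses.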
Referring to fig. 7A to 7B, in some embodiments, before step S110, the entity relationship identifying method provided in the embodiments of the present application further includes, but is not limited to, steps S701 to S702.
Step S701, splicing the head entity mark, the predicted head entity, the tail entity mark, the predicted tail entity, the entity relation mark and the predicted entity relation to obtain initial relation data;
step S702, if the initial relationship data is inconsistent with the preset relationship data, performing the preset relationship identification operation in a circulating manner until the updated initial relationship data is consistent with the preset relationship data.
The relationship identifying operation includes, but is not limited to, including step S7021 to step S7028.
Step S7021, splicing the head entity mark and the initial relation data to obtain preliminary prediction data;
step S7022, extracting features of the preliminary prediction data according to a decoder to obtain a preliminary head vector;
step S7023, performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary head vector to obtain a preliminary head entity;
step S7024, extracting features of the tail entity marks, the preliminary head entities and the preliminary prediction data according to the decoder to obtain preliminary tail vectors;
Step S7025, performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary tail vector to obtain a preliminary tail entity;
step S7026, extracting features of entity relation marks, preliminary tail entities, tail entity marks, preliminary head entities and preliminary prediction data according to a decoder to obtain preliminary relation vectors;
step S7027, extracting entity relations from the preliminary relation vectors according to the relation prediction layer to obtain preliminary entity relations;
step S7028, updating the initial relationship data according to the initial entity relationship, the entity relationship flag, the initial tail entity, the tail entity flag, the initial head entity, and the head entity flag.
It should be noted that steps S101 to S109 only show how one entity pair (comprising a predicted head entity and a predicted tail entity) and the entity relationship of that entity pair (i.e., the predicted entity relationship) are determined from the sample recognition text. However, in some embodiments, the sample recognition text may also include other entity pairs, or an entity pair may have other entity relationships, so the relationship identification operation needs to be performed cyclically to determine all entity pairs in the sample recognition text and all entity relationships corresponding to each entity pair. The cyclic relationship identification operation is described in detail below.
In step S701 of some embodiments, still described by way of example in steps S101 to S109, the HEAD entity tag, the predicted HEAD entity, the TAIL entity tag, the predicted TAIL entity, the entity relationship tag, and the predicted entity relationship are sequentially spliced to obtain initial relationship data [ < HEAD > small < TAIL > song < RELA > phrase ].
In step S702 of some embodiments, the preset relationship data is preset according to the sample identification text, and includes all entity pairs, and data of all entity relationships corresponding to each entity pair. Such as may be the data shown in table (1). When the initial relation data is inconsistent with the preset relation data, the fact that unrecognized entity pairs or unrecognized entity relations exist in the sample identification text is indicated, and therefore relation identification operation is needed.
The relationship identification operation is specifically as follows:
in step S7021 of some embodiments, the initial relationship data and the head entity marker are spliced in order to obtain preliminary prediction data, e.g., the preliminary prediction data [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD>].
In step S7022 of some embodiments, the preliminary prediction data is used as the input of the decoder to obtain a preliminary head vector, e.g., a preliminary head vector t_head. It will be appreciated that the way the preliminary head vector is derived from the decoder is similar to step S104, except that the input of the decoder is Input = [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD>], and the output of the decoder is a preliminary head vector t_head that carries the information of the entity pair "Xiaoming" and "song".
In step S7023 of some embodiments, similar to the operation of step S105, another head entity in the sample recognition text is determined according to the preliminary head vector t_head, for example, the preliminary head entity "Xiaohong".
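The span determination in steps S7022 to S7023 can be pictured as an attention query: the marker vector produced by the decoder scores every token position of the sample text vector, and the highest-scoring start and end positions delimit the extracted entity. The NumPy sketch below uses toy position vectors and dimensions of my own choosing; the patent does not fix the scoring function at this level of detail, so dot-product attention is an assumption:

```python
import numpy as np

def extract_entity(marker_vec, head_pos_vecs, tail_pos_vecs, tokens):
    """Dot-product attention of the decoder's marker vector against the
    entity head position vectors and entity tail position vectors; the
    argmax positions give the entity span (cf. steps S7022-S7023)."""
    start = int((head_pos_vecs @ marker_vec).argmax())
    end = int((tail_pos_vecs @ marker_vec).argmax())
    return "".join(tokens[start:end + 1])

# Toy deterministic example: 5 tokens, position vectors chosen so the
# marker vector points the start at index 1 and the end at index 3.
tokens = list("ABCDE")
H = np.eye(5)          # entity head position vectors (toy)
T = np.eye(5)[::-1]    # entity tail position vectors (toy)
marker = np.eye(5)[1]  # decoder output for the spliced marker (toy)
print(extract_entity(marker, H, T, tokens))  # → BCD
```

In the real model the position vectors come from the encoder's full-connection layers and the marker vector from the decoder; only the argmax-over-scores idea is illustrated here.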
In step S7024 of some embodiments, the preliminary prediction data, the preliminary head entity, and the tail entity marker are spliced in order, and the spliced data is used as the input of the decoder, e.g., Input = [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD> Xiaohong <TAIL>]. Similar to the operation of step S106, a preliminary tail vector t_tail is obtained.
In step S7025 of some embodiments, similar to the operation of step S107, the preliminary tail entity corresponding to the preliminary head entity in the sample recognition text is determined according to the preliminary tail vector t_tail, for example, the preliminary tail entity "song".
In step S7026 of some embodiments, the preliminary prediction data, the preliminary head entity, the tail entity marker, the preliminary tail entity, and the entity relationship marker are spliced in order, and the spliced data is used as the input of the decoder, e.g., Input = [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD> Xiaohong <TAIL> song <RELA>]. Similar to the operation of step S501, a preliminary relationship vector t_rela is obtained.
In step S7027 of some embodiments, similar to the operation of step S502, the preliminary entity relationship between the preliminary head entity and the preliminary tail entity in the sample recognition text is determined according to the preliminary relationship vector t_rela, for example, the preliminary entity relationship "composer".
In step S7028 of some embodiments, the preliminary prediction data, the preliminary head entity, the tail entity marker, the preliminary tail entity, the entity relationship marker, and the preliminary entity relationship are spliced in order to update the initial relationship data. For example, after this relationship identification operation, the updated relationship data is [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD> Xiaohong <TAIL> song <RELA> composer].
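Steps S7021 to S7028 can be sketched as one loop body that keeps appending markers, decoding, extracting, and accumulating triples until the relationship data matches the preset relationship data. The `decode_and_extract` callable below is a stand-in stub for the trained decoder plus extraction layers; its name, the prompt format, and the fixed answers are illustrative only:

```python
def relation_identification_loop(initial_data, preset_data, decode_and_extract):
    """Repeat the relationship identification operation (S7021-S7028)
    until the accumulated relation data equals the preset relation data."""
    data = initial_data
    while data != preset_data:
        # S7021-S7023: splice <HEAD>, decode, extract the next head entity
        head = decode_and_extract(data + "<HEAD>")
        # S7024-S7025: splice the head entity and <TAIL>, extract the tail
        tail = decode_and_extract(data + f"<HEAD>{head}<TAIL>")
        # S7026-S7027: splice the tail entity and <RELA>, extract the relation
        rela = decode_and_extract(data + f"<HEAD>{head}<TAIL>{tail}<RELA>")
        # S7028: update the relation data with the newly found triple
        data += f"<HEAD>{head}<TAIL>{tail}<RELA>{rela}"
    return data

# Stub "decoder": answers fixed predictions for the running example,
# keyed on the marker that was just spliced onto the prompt.
answers = {"<HEAD>": "Xiaohong", "<TAIL>": "song", "<RELA>": "composer"}
stub = lambda prompt: answers[prompt[prompt.rindex("<"):]]

start = "<HEAD>Xiaoming<TAIL>song<RELA>lyricist"
goal = start + "<HEAD>Xiaohong<TAIL>song<RELA>composer"
print(relation_identification_loop(start, goal, stub))
```

A real implementation would stop on a learned end condition rather than string equality with the preset data at inference time; the equality check here mirrors the training-time description in step S702.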
It can be understood that if the updated relationship data is still inconsistent with the preset relationship data, the relationship identification operation is performed again. Additionally, when only further entity relationships of an already identified entity pair need to be found, it suffices to splice the entity relationship marker <RELA>. For example, to determine an entity relationship between the entity pair "Xiaohong" and "song" other than "composer", only the entity relationship marker <RELA> is spliced onto the updated relationship data to obtain the spliced data [<HEAD> Xiaoming <TAIL> song <RELA> lyricist <HEAD> Xiaohong <TAIL> song <RELA> composer <RELA>]. Steps S7026 and S7027 are then performed again on the spliced data, thereby identifying another entity relationship "singer" of the entity pair "Xiaohong" and "song".
The benefit of steps S701 to S702 is that executing the relationship identification operation in a loop determines all entity pairs in the sample recognition text and all entity relationships contained by each entity pair. When the original entity relationship recognition model is parameter-adjusted according to all entity pairs and all of their entity relationships, the resulting target entity relationship recognition model performs more comprehensive entity pair recognition and entity relationship recognition.
In step S110 of some embodiments, the target recognition text is text that requires entity recognition and entity relationship recognition. The target recognition text is used as the input of the target entity relationship recognition model, which outputs the head entities, the tail entities, and the entity relationships between them in the target recognition text. Downstream NLP tasks such as knowledge graph construction and question-answering systems can then be performed with the identified head entities, tail entities, and entity relationships. When the present application is applied to a terminal, the terminal may run an application program that executes the entity relationship identification method. The application program may include an input window, and the target recognition text may be acquired in response to an operation of pasting text into the input window. Alternatively, the application program may provide a file upload control; a file containing the target recognition text is obtained in response to an operation triggering the control, and the target recognition text is extracted from the file by operations such as optical character recognition (Optical Character Recognition, OCR). It can therefore be seen that the embodiments of the present application do not particularly limit the manner of acquiring the target recognition text.
Referring to fig. 8, an embodiment of the present application further provides an entity relationship identifying apparatus, which may implement the above entity relationship identifying method, where the apparatus includes:
a text obtaining module 801, configured to obtain a sample identification text, where the sample identification text includes a sample head entity, a sample tail entity, and a sample entity relationship between the sample head entity and the sample tail entity;
the data input module 802 is configured to input a sample recognition text to a preset original entity relationship recognition model, where the original entity relationship recognition model includes an encoder, a decoder, a relationship prediction layer, and an entity extraction layer;
a first feature extraction module 803, configured to perform feature extraction on the sample recognition text according to the encoder to obtain a sample text vector;
a second feature extraction module 804, configured to perform feature extraction on a preset header entity tag according to a decoder, so as to obtain a header tag vector;
the first entity extraction module 805 is configured to perform entity extraction on the sample text vector according to the entity extraction layer and the header label vector, so as to obtain a predicted header entity;
a third feature extraction module 806, configured to perform feature extraction on a preset tail entity tag, a predicted head entity, and a head entity tag according to the decoder, to obtain a tail tag vector;
A second entity extraction module 807, configured to perform entity extraction on the sample text vector according to the entity extraction layer and the tail label vector, to obtain a predicted tail entity;
the entity relation extracting module 808 is configured to extract the entity relation between the predicted head entity and the predicted tail entity according to the relation predicting layer, so as to obtain a predicted entity relation;
the parameter adjustment module 809 is configured to perform parameter adjustment on the original entity relationship identification model according to the sample head entity, the sample entity relationship, the predicted head entity, and the predicted entity relationship to obtain a target entity relationship identification model;
the entity relationship recognition module 810 is configured to perform entity relationship recognition on the obtained target recognition text according to the target entity relationship recognition model.
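The parameter adjustment performed by module 809 combines three losses into one training objective: a head position loss, a tail position loss, and a relation loss. The patent does not fix the individual loss functions, so the binary cross-entropy over per-token start/end probabilities and the categorical cross-entropy over relation labels below are assumptions used for illustration:

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy over per-token position probabilities (assumed form)."""
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def total_loss(head_pred, head_tgt, tail_pred, tail_tgt, rela_pred, rela_tgt):
    """Sum of head position loss, tail position loss, and relation loss --
    the quantity minimized when adjusting the model parameters."""
    head_loss = bce(head_pred, head_tgt)
    tail_loss = bce(tail_pred, tail_tgt)
    # Categorical cross-entropy on the predicted relation distribution.
    rela_loss = float(-np.log(np.clip(rela_pred[rela_tgt], 1e-7, 1.0)))
    return head_loss + tail_loss + rela_loss

# Toy 3-token text, 3-label relation vocabulary.
head_pred = np.array([0.9, 0.1, 0.1]); head_tgt = np.array([1.0, 0.0, 0.0])
tail_pred = np.array([0.1, 0.1, 0.9]); tail_tgt = np.array([0.0, 0.0, 1.0])
rela_pred = np.array([0.7, 0.2, 0.1])  # distribution over relation labels
loss = total_loss(head_pred, head_tgt, tail_pred, tail_tgt, rela_pred, 0)
print(round(loss, 3))  # → 0.567
```

In training, this scalar would be backpropagated through the decoder, encoder, extraction layer, and relation prediction layer; weighting the three terms differently is a common variant the patent leaves open.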
The specific implementation manner of the entity relationship recognition device is basically the same as that of the specific embodiment of the entity relationship recognition method, and is not described herein again.
The embodiment of the application also provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the entity relationship identification method when executing the computer program. The electronic device can be any intelligent terminal, including a tablet computer, a vehicle-mounted computer, and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device of another embodiment, the electronic device including:
the processor 901 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided by the embodiments of the present application;
the memory 902 may be implemented in the form of read-only memory (Read Only Memory, ROM), static storage, dynamic storage, or random access memory (Random Access Memory, RAM). The memory 902 may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present application are implemented by software or firmware, the relevant program code is stored in the memory 902, and the processor 901 invokes and executes the entity relationship identification method of the embodiments of the present application;
an input/output interface 903 for inputting and outputting information;
the communication interface 904 is configured to implement communication interaction between this device and other devices, either in a wired manner (e.g., USB, network cable) or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth);
A bus 905 that transfers information between the various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively coupled to each other within the device via a bus 905.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the entity relationship identification method when being executed by a processor.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and as those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by those skilled in the art that the technical solutions shown in the figures do not constitute limitations of the embodiments of the present application, and may include more or fewer steps than shown, or may combine certain steps, or different steps.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing a program.
Preferred embodiments of the present application are described above with reference to the accompanying drawings, and thus do not limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (10)

1. A method for identifying entity relationships, the method comprising:
acquiring a sample recognition text, wherein the sample recognition text comprises a sample head entity, a sample tail entity and a sample entity relation between the sample head entity and the sample tail entity;
inputting the sample recognition text into a preset original entity relationship recognition model, wherein the original entity relationship recognition model comprises an encoder, a decoder, a relationship prediction layer and an entity extraction layer;
extracting features of the sample recognition text according to the encoder to obtain a sample text vector;
extracting features of a preset head entity mark according to the decoder to obtain a head mark vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the head mark vector to obtain a prediction head entity;
performing feature extraction on a preset tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a tail mark vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity;
extracting entity relations from the prediction head entity and the prediction tail entity according to the relation prediction layer to obtain a predicted entity relation;
performing parameter adjustment on the original entity relationship recognition model according to the sample head entity, the sample entity relationship, the prediction head entity and the prediction entity relationship to obtain a target entity relationship recognition model;
and carrying out entity relationship recognition on the obtained target recognition text according to the target entity relationship recognition model.
2. The method of claim 1, wherein the sample text vector comprises an entity head position vector and an entity tail position vector;
the extracting features of the sample recognition text according to the encoder to obtain a sample text vector comprises:
extracting features of the sample recognition text according to the encoder to obtain a preliminary text vector;
extracting entity head position features of the preliminary text vector according to a preset first full-connection layer to obtain the entity head position vector;
and extracting the entity tail position features of the preliminary text vector according to a preset second full-connection layer to obtain the entity tail position vector.
3. The method according to claim 2, wherein the performing entity extraction on the sample text vector according to the entity extraction layer and the head mark vector to obtain a prediction head entity comprises:
performing attention calculation on the entity head position vector according to the head mark vector to obtain a first prediction head position vector;
performing attention calculation on the entity tail position vector according to the head mark vector to obtain a first predicted tail position vector;
and carrying out entity extraction on the preliminary text vector according to the entity extraction layer, the first prediction head position vector and the first prediction tail position vector to obtain the prediction head entity.
4. A method according to claim 3, wherein the performing entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity comprises:
performing attention calculation on the entity head position vector according to the tail mark vector to obtain a second prediction head position vector;
performing attention calculation on the entity tail position vector according to the tail mark vector to obtain a second predicted tail position vector;
and carrying out entity extraction on the preliminary text vector according to the entity extraction layer, the second prediction head position vector and the second prediction tail position vector to obtain the prediction tail entity.
5. A method according to claim 3, wherein the extracting the entity relationship between the prediction head entity and the prediction tail entity according to the relationship prediction layer to obtain a predicted entity relationship comprises:
performing feature extraction on a preset entity relation mark, the predicted tail entity, the tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a relation mark vector;
and extracting the entity relationship from the relationship marking vector according to the relationship prediction layer to obtain the predicted entity relationship.
6. The method of claim 5, wherein performing parameter adjustment on the original entity relationship recognition model according to the sample head entity, the sample entity relationship, the predicted head entity, and the predicted entity relationship to obtain a target entity relationship recognition model comprises:
performing head position loss calculation according to the first prediction head position vector and a preset sample head position vector to obtain head position loss;
performing tail position loss calculation according to the first predicted tail position vector and a preset sample tail position vector to obtain tail position loss;
calculating entity relation loss according to the sample entity relation and the predicted entity relation to obtain relation loss;
and carrying out parameter adjustment on the original entity relationship recognition model according to the head position loss, the tail position loss and the relationship loss to obtain the target entity relationship recognition model.
7. The method of claim 5, wherein prior to the entity-relationship recognition of the obtained target recognition text according to the target entity-relationship recognition model, the method further comprises:
splicing the head entity mark, the prediction head entity, the tail entity mark, the prediction tail entity, the entity relation mark and the prediction entity relation to obtain initial relation data;
if the initial relation data is inconsistent with the preset relation data, circularly executing preset relation identification operation until the updated initial relation data is consistent with the preset relation data;
wherein the relationship identification operation includes:
splicing the head entity mark with the initial relation data to obtain preliminary prediction data;
extracting features of the preliminary prediction data according to the decoder to obtain a preliminary head vector;
performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary head vector to obtain a preliminary head entity;
performing feature extraction on the tail entity mark, the preliminary head entity and the preliminary prediction data according to the decoder to obtain a preliminary tail vector;
Performing entity extraction on the sample text vector according to the entity extraction layer and the preliminary tail vector to obtain a preliminary tail entity;
performing feature extraction on the entity relation mark, the preliminary tail entity, the tail entity mark, the preliminary head entity and the preliminary prediction data according to the decoder to obtain a preliminary relation vector;
extracting entity relations from the preliminary relation vector according to the relation prediction layer to obtain a preliminary entity relation;
and updating the initial relation data according to the initial entity relation, the entity relation mark, the initial tail entity, the tail entity mark, the initial head entity and the head entity mark.
8. An entity relationship identification apparatus, the apparatus comprising:
the text acquisition module is used for acquiring a sample recognition text, wherein the sample recognition text comprises a sample head entity, a sample tail entity and a sample entity relationship between the sample head entity and the sample tail entity;
the data input module is used for inputting the sample recognition text into a preset original entity relationship recognition model, wherein the original entity relationship recognition model comprises an encoder, a decoder, a relationship prediction layer and an entity extraction layer;
the first feature extraction module is used for performing feature extraction on the sample recognition text according to the encoder to obtain a sample text vector;
the second feature extraction module is used for carrying out feature extraction on a preset head entity mark according to the decoder to obtain a head mark vector;
the first entity extraction module is used for carrying out entity extraction on the sample text vector according to the entity extraction layer and the header mark vector to obtain a prediction header entity;
the third feature extraction module is used for carrying out feature extraction on the preset tail entity mark, the predicted head entity and the head entity mark according to the decoder to obtain a tail mark vector;
the second entity extraction module is used for carrying out entity extraction on the sample text vector according to the entity extraction layer and the tail mark vector to obtain a predicted tail entity;
the entity relation extraction module is used for extracting entity relation of the prediction head entity and the prediction tail entity according to the relation prediction layer to obtain a predicted entity relation;
the parameter adjustment module is used for carrying out parameter adjustment on the original entity relationship identification model according to the sample head entity, the sample entity relationship, the prediction head entity and the prediction entity relationship to obtain a target entity relationship identification model;
And the entity relationship recognition module is used for carrying out entity relationship recognition on the obtained target recognition text according to the target entity relationship recognition model.
9. An electronic device comprising a memory storing a computer program and a processor implementing the method of any of claims 1 to 7 when the computer program is executed by the processor.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
CN202310405543.0A 2023-04-12 2023-04-12 Entity relationship identification method and device, electronic equipment and storage medium Pending CN116362252A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310405543.0A CN116362252A (en) 2023-04-12 2023-04-12 Entity relationship identification method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116362252A true CN116362252A (en) 2023-06-30

Family

ID=86909296



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination