CN116127081A - Data processing method, device and computer readable storage medium - Google Patents


Info

Publication number
CN116127081A
CN116127081A
Authority
CN
China
Prior art keywords
semantic vector
vector
entity
text
entity word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111346297.3A
Other languages
Chinese (zh)
Inventor
蒋乐怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111346297.3A
Publication of CN116127081A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/367 — Information retrieval of unstructured textual data; creation of semantic tools; ontology
    • G06F 16/35 — Information retrieval of unstructured textual data; clustering; classification
    • G06F 40/289 — Natural language analysis; phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/30 — Handling natural language data; semantic analysis

Abstract

The embodiment of the application discloses a data processing method, a device and a computer readable storage medium. The method comprises: acquiring a text, and acquiring a shared semantic vector corresponding to each word in the text; acquiring, based on the shared semantic vectors, a first entity word in the text belonging to a first part of speech; acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vectors, and performing vector fusion on the conditional shared semantic vector and the shared semantic vectors to obtain a target semantic vector; and performing vector recognition on the target semantic vector to obtain a triplet comprising the first entity word, a second entity word belonging to a second part of speech, and a relation entity word. The second entity word in the triplet belongs to the text; the relation entity word in the triplet is used for characterizing the association relationship between the first entity word and the second entity word in the triplet. By adopting this method and device, the recognition rate of the association relationship between the first entity word and the second entity word can be improved.

Description

Data processing method, device and computer readable storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a data processing method, a data processing device, and a computer readable storage medium.
Background
With the rapid development of artificial intelligence, intelligent data analysis gradually replaces traditional artificial data analysis, for example, business companies begin to utilize artificial intelligence to realize automatic data analysis.
Entity extraction from text and relationship recognition between entities are common scenarios in automatic data analysis. In an existing relationship recognition method, service personnel preset a relationship entity word (characterizing the association relationship between a subject and an object) for each object entity word, i.e., a mapping relationship between object entity words and relationship entity words is established in advance. When the association relationship between subject and object entity words in a text needs to be identified, the subject entity word and the object entity word in the text are first extracted, and the relationship indicated by the relationship entity word mapped to the object entity word is then determined as the association relationship between the subject entity word and the object entity word. For example, suppose a business person has built in advance a mapping between "abnormal activity a" (an object entity word) and "suspected of" (a relationship entity word). When the text "company c solemnly states that it cracks down on abnormal activity a" is recognized, the existing method extracts "company c" (a subject entity word) and "abnormal activity a", and so generates the triplet (company c, suspected of, abnormal activity a). In fact, however, company c has no "suspected of" association with abnormal activity a. The existing relationship recognition method may therefore extract association relationships between entities incorrectly, i.e., the relationship recognition rate between entities is reduced.
Disclosure of Invention
The embodiment of the application provides a data processing method, data processing equipment and a computer readable storage medium, which can improve the recognition rate of the association relationship between a first entity word and a second entity word.
In one aspect, an embodiment of the present application provides a data processing method, including:
acquiring a text, and acquiring a shared semantic vector corresponding to each word in the text;
based on the shared semantic vector, acquiring a first entity word belonging to a first part of speech in the text;
acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector, and carrying out vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
carrying out vector recognition on the target semantic vector to obtain a triplet comprising the first entity word, a second entity word belonging to a second part of speech and a relation entity word; the second entity word in the triplet belongs to the text; the relation entity word in the triplet is used for characterizing the association relationship between the first entity word and the second entity word in the triplet.
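One plausible reading of the four steps above, sketched with toy NumPy tensors. The shapes, the mean-pooled conditional vector, the additive fusion, and the linear relation classifier are all illustrative assumptions; the patent does not commit to any of them.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim, n_relations = 6, 8, 4

# Step 1: shared semantic vectors, one per word in the text (e.g. from a shared encoder).
shared = rng.normal(size=(seq_len, dim))

# Step 2 (assumed): some tagger marks which positions form the first entity word.
first_entity_mask = np.array([1, 1, 0, 0, 0, 0], dtype=bool)

# Step 3: the conditional shared semantic vector is taken from the shared vectors at the
# first entity word's positions (mean-pooled here), then fused (added here) back in.
cond = shared[first_entity_mask].mean(axis=0)
target = shared + cond  # target semantic vector, shape (seq_len, dim)

# Step 4 (assumed): per-position relation scores over the fused representation.
w = rng.normal(size=(dim, n_relations))
relation_scores = target @ w

assert target.shape == (seq_len, dim)
assert relation_scores.shape == (seq_len, n_relations)
```

Because the condition is baked into every position's target vector, the relation scores are computed "given the first entity word", which is the mechanism the abstract credits for the improved recognition rate.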
In one aspect, an embodiment of the present application provides a data processing method, including:
acquiring a training sample set; the training sample set comprises a sample text, a first tag entity word in the sample text, and a tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to a first part of speech; the tag triplet comprises the first tag entity word, a second tag entity word belonging to a second part of speech, and a tag relation entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; the tag relation entity word in the tag triplet is used for characterizing the association relationship between the first tag entity word and the second tag entity word in the tag triplet;
Inputting the sample text into a text recognition initial model, and acquiring a prediction sharing semantic vector corresponding to each sample word in the sample text in the text recognition initial model;
based on the prediction sharing semantic vector, acquiring a first prediction entity word in the sample text;
obtaining a prediction condition sharing semantic vector corresponding to the first prediction entity word from the prediction sharing semantic vector, and carrying out vector fusion on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector;
vector recognition is carried out on the predicted target semantic vector, and a predicted triplet comprising a first predicted entity word, a second predicted entity word and a predicted relation entity word is obtained; the second predicted entity word in the predicted triplet belongs to the sample text;
according to the first predicted entity word, the first tag entity word, the predicted triplet and the tag triplet, parameters in the initial text recognition model are adjusted, and a text recognition model is generated; the text recognition model is used to generate triples for text.
An aspect of an embodiment of the present application provides a data processing apparatus, including:
the first acquisition module is used for acquiring texts and acquiring shared semantic vectors corresponding to each word in the texts respectively;
The second acquisition module is used for acquiring a first entity word belonging to the first part of speech in the text based on the shared semantic vector;
the first generation module is used for acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector, and carrying out vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
the second generation module is used for carrying out vector recognition on the target semantic vector to obtain a triplet comprising a first entity word, a second entity word belonging to a second part of speech and a relation entity word; the second entity word in the triplet belongs to text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
Wherein, the data processing device further includes:
the first acquisition module is also used for acquiring a text recognition model and inputting the text into the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
the first acquisition module is used for performing segmentation processing on the text based on the input layer to obtain at least two segmented words; the at least two segmented words include a segmented word E_f, where f is a positive integer and f is less than or equal to the total number of the at least two segmented words;
a third obtaining module, used for obtaining position information of the segmented word E_f in the text, and inputting the position information of the segmented word E_f into the shared coding layer;
a third generation module, used for performing vector encoding on the position information of the segmented word E_f based on the shared coding layer to obtain a shared position vector corresponding to the segmented word E_f;
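The patent does not specify how the position information of a segmented word is turned into a shared position vector. A sinusoidal encoding, sketched below, is one common choice and is purely illustrative:

```python
import numpy as np

def position_vector(pos: int, dim: int) -> np.ndarray:
    """Sinusoidal encoding of a token position — one common way to vector-encode
    the position information of a segmented word E_f; the patent does not name
    the concrete encoding, so this is an assumption."""
    i = np.arange(dim // 2)
    angles = pos / np.power(10000.0, 2 * i / dim)
    vec = np.empty(dim)
    vec[0::2] = np.sin(angles)  # even slots: sine terms
    vec[1::2] = np.cos(angles)  # odd slots: cosine terms
    return vec

# e.g. the shared position vector for the word at position 3, with dimension 8
shared_position_vector = position_vector(pos=3, dim=8)
```

Whatever the actual encoding, its role in the claims is the same: it lets the later modules look up the conditional shared semantic vector by the first entity word's position.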
the first generation module includes:
a determining position unit, configured to determine position information of the first entity word in the text; the position information of the first entity word in the text belongs to the position information of at least two segmentation words in the text respectively;
the first acquisition unit is used for acquiring the shared position vector corresponding to the first entity word from the shared position vectors corresponding to the at least two segmentation words respectively based on the position information of the first entity word in the text;
the second obtaining unit is used for obtaining the conditional shared semantic vector corresponding to the first entity word from the shared semantic vector based on the shared position vector corresponding to the first entity word.
Wherein, the first generation module includes:
a third acquisition unit configured to acquire a text recognition model; the text recognition model comprises a first coding layer; the first encoding layer includes a self-attention component, a first normalization component, a feed-forward component, and a second normalization component;
The first input unit is used for inputting the shared semantic vector into the self-attention component, and carrying out vector coding on the shared semantic vector based on the self-attention component to obtain a first semantic vector to be normalized;
the second input unit is used for respectively inputting the first semantic vector to be normalized and the shared semantic vector into the first normalization component, and carrying out weighted fusion on the first semantic vector to be normalized and the shared semantic vector based on the first normalization component to obtain a semantic vector to be fed forward;
the third input unit is used for inputting the semantic vector to be fed forward to the feedforward component, and carrying out vector coding on the semantic vector to be fed forward based on the feedforward component to obtain a second semantic vector to be normalized;
the fourth input unit is used for respectively inputting the conditional shared semantic vector, the second semantic vector to be normalized and the semantic vector to be fed forward into the second normalization component, and performing vector fusion on the conditional shared semantic vector, the second semantic vector to be normalized and the semantic vector to be fed forward based on the second normalization component to obtain the target semantic vector.
The second normalization component comprises an average sub-component, a distance sub-component, a standard sub-component, a scaling sub-component, a weighting sub-component and a fusion sub-component;
A fourth input unit including:
the first generation subunit is used for carrying out vector average on the second semantic vector to be normalized and the semantic vector to be fed forward based on the average subassembly to obtain an average semantic vector;
the second generating subunit is used for acquiring vector distances between the second semantic vector to be normalized and the average semantic vector based on the distance subassembly to obtain a first distance vector, and acquiring vector distances between the semantic vector to be fed forward and the average semantic vector to obtain a second distance vector;
the third generation subunit is used for computing, based on the standard sub-component, the vector standard deviation of the second semantic vector to be normalized and the semantic vector to be fed forward, to obtain a standard semantic vector;
the fourth generation subunit is used for carrying out vector scaling on the first distance vector and the standard semantic vector based on the scaling subassembly to obtain a first scaling vector, and carrying out vector scaling on the second distance vector and the standard semantic vector to obtain a second scaling vector;
a fifth generating subunit, configured to generate a first weight feature corresponding to the conditional sharing semantic vector, and a second weight feature corresponding to the conditional sharing semantic vector;
the sixth generation subunit is used for carrying out weighted fusion on the first scaling vector, the second scaling vector and the first weight characteristic based on the weighting sub-assembly to obtain a semantic vector to be fused;
And the seventh generation subunit is used for carrying out vector fusion on the second weight characteristic and the semantic vector to be fused based on the fusion subunit to obtain a target semantic vector.
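Taken together, the sub-components above read like a conditional layer normalization: the to-be-normalized and feed-forward vectors are combined, normalized by their mean and standard deviation, and then scaled and shifted by two weight features generated from the conditional shared semantic vector. A minimal NumPy sketch under that (assumed) reading:

```python
import numpy as np

def conditional_layer_norm(x_norm, x_ff, cond, w_gamma, w_beta, eps=1e-6):
    h = x_norm + x_ff                     # combine the second to-be-normalized
                                          # vector and the feed-forward vector
    mu = h.mean(axis=-1, keepdims=True)   # average sub-component
    dist = h - mu                         # distance sub-component
    std = h.std(axis=-1, keepdims=True)   # standard(-deviation) sub-component
    scaled = dist / (std + eps)           # scaling sub-component
    gamma = cond @ w_gamma                # first weight feature from the condition
    beta = cond @ w_beta                  # second weight feature from the condition
    return gamma * scaled + beta          # weighting + fusion sub-components

dim = 8
rng = np.random.default_rng(1)
out = conditional_layer_norm(
    rng.normal(size=dim), rng.normal(size=dim), rng.normal(size=dim),
    rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim)))
assert out.shape == (dim,)
```

Note that both weight features are derived from the conditional shared semantic vector, which matches the claim text: the condition steers both the scale and the shift of the normalized output.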
Wherein, the second generation module includes:
a fourth acquisition unit configured to acquire a text recognition model; the text recognition model comprises a relation recognition layer;
the fourth acquisition unit is further used for inputting the target semantic vector into the relation recognition layer, and carrying out vector recognition on the target semantic vector based on the relation recognition layer to obtain a recognition semantic vector;
the first generation unit is used for generating a triplet comprising a first entity word, a second entity word belonging to a second part of speech and a relation entity word based on the recognition semantic vector.
Wherein, the first acquisition module includes:
a fifth acquisition unit for acquiring a text recognition model, and inputting the text into the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
the second generation unit is used for carrying out segmentation processing on the text based on the input layer to obtain at least two segmented words, and respectively inputting the at least two segmented words into the shared coding layer;
and the third generating unit is used for respectively carrying out vector coding on at least two segmented words based on the shared coding layer to obtain shared semantic vectors respectively corresponding to each segmented word.
Wherein, the second acquisition module includes:
a sixth acquisition unit configured to acquire a text recognition model; the text recognition model comprises a second coding layer, an entity recognition layer and a decoding layer;
the fifth input unit is used for inputting the shared semantic vector into the second coding layer, and carrying out vector coding on the shared semantic vector based on the second coding layer to obtain a semantic vector to be identified;
the sixth input unit is used for inputting the semantic vector to be identified into the entity identification layer, and carrying out vector identification on the semantic vector to be identified based on the entity identification layer to obtain a semantic vector to be decoded for representing the first entity word;
the seventh input unit is configured to input the semantic vector to be decoded, which is used for representing the first entity word, into the decoding layer, and perform vector decoding on the semantic vector to be decoded, which is used for representing the first entity word, based on the decoding layer, so as to obtain the first entity word belonging to the first part of speech in the text.
Wherein the semantic vectors to be identified include a semantic vector to be identified A_b, where b is a positive integer and b is less than or equal to the total number of semantic vectors to be identified; the entity recognition layer includes an entity recognition component C_b for the semantic vector to be identified A_b;
the sixth input unit includes:
an eighth generation subunit, used for, if the entity recognition component C_b is the first entity recognition component in the entity recognition layer, performing vector recognition on the semantic vector to be identified A_b in the entity recognition component C_b to obtain a semantic vector to be decoded D_b corresponding to the semantic vector to be identified A_b;
a ninth generation subunit, used for, if the entity recognition component C_b is not the first entity recognition component in the entity recognition layer, performing vector fusion, in the entity recognition component C_b, on the semantic vector to be identified A_b and a target semantic vector to be identified that has a position association relationship with the semantic vector to be identified A_b, to obtain a semantic vector to be decoded D_b corresponding to the semantic vector to be identified A_b; the target semantic vector to be identified belongs to the semantic vectors to be identified;
a tenth generation subunit, used for obtaining, based on the semantic vectors to be decoded corresponding to each semantic vector to be identified, the semantic vector to be decoded that is used to characterize the first entity word.
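One way to picture the per-position entity recognition components C_b is a left-to-right pass in which the first component uses its input vector alone, and each later component fuses its input A_b with the positionally associated (here assumed to be the previous) vector. The averaging fusion below is an illustrative placeholder, not the patent's actual operation:

```python
import numpy as np

def entity_recognition_layer(to_identify: np.ndarray) -> np.ndarray:
    """Sketch of the components C_b: row b of `to_identify` is A_b, row b of the
    result is the to-be-decoded vector D_b. Fusion by simple averaging is assumed."""
    to_decode = np.empty_like(to_identify)
    to_decode[0] = to_identify[0]                 # first component: A_1 alone
    for b in range(1, len(to_identify)):
        # later components: fuse A_b with the positionally associated vector
        to_decode[b] = 0.5 * (to_identify[b] + to_identify[b - 1])
    return to_decode

vecs = np.arange(12, dtype=float).reshape(4, 3)   # four A_b vectors of dimension 3
decoded = entity_recognition_layer(vecs)
```

The point of the recurrence is that each D_b carries information from its neighbor, so the decoding layer that follows can recover multi-token first entity words.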
Wherein the triples include at least two triples for the first entity word; the at least two triples include a triplet G_h, where h is a positive integer and h is less than or equal to the total number of the at least two triples;
the data processing apparatus further includes:
a first determining module, used for determining whether the association relationship attribute characterized by the relation entity word in the triplet G_h is a negative association relationship, and if the association relationship attribute is a negative association relationship, determining the object characterized by the first entity word as a target object;
a second determining module, used for acquiring, from the at least two triples, a triple including a short-name relation entity word, and determining the second entity word in the triple including the short-name relation entity word as a short-name entity word; the object characterized by the short-name entity word is equal to the target object; the short-name relation entity word refers to a relation entity word whose characterized association relationship attribute is a short-name association relationship;
the second determining module is further used for acquiring, from the at least two triples, a triple including an attribution relation entity word, and determining the second entity word in the triple including the attribution relation entity word as an attribution entity word; the object characterized by the attribution entity word is attributed to the target object; the attribution relation entity word refers to a relation entity word whose characterized association relationship attribute is an attribution association relationship;
an association storage module, used for storing, in association, the relation entity word in the triplet G_h, the short-name entity word, and the attribution entity word.
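The modules above amount to a small post-processing routine over the extracted triples: find a triple whose relation characterizes a negative association, treat its first entity word as the target object, then collect the short-name and attribution entity words from the other triples. The sketch below illustrates this with invented relation labels and entity names, none of which come from the patent:

```python
# Illustrative triples for one first entity word (all labels/names are made up).
triples = [
    ("company c", "cracks down on", "abnormal activity a"),  # negative association
    ("company c", "short for", "c corp"),                    # short-name relation
    ("company c", "belongs to", "group g"),                  # attribution relation
]

NEGATIVE, SHORT_NAME, ATTRIBUTION = "cracks down on", "short for", "belongs to"

record = {}
for subj, rel, obj in triples:
    if rel == NEGATIVE:
        # negative association: the first entity word becomes the target object
        record["target_object"] = subj
        record["relation_entity_word"] = rel
    elif rel == SHORT_NAME:
        record["short_name_entity_word"] = obj
    elif rel == ATTRIBUTION:
        record["attribution_entity_word"] = obj

assert record["target_object"] == "company c"
```

The resulting record is what the association storage module would persist: the relation entity word together with the short-name and attribution entity words for the target object.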
An aspect of an embodiment of the present application provides a data processing apparatus, including:
The first acquisition module is used for acquiring a training sample set; the training sample set comprises a sample text, a first tag entity word in the sample text, and a tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to a first part of speech; the tag triplet comprises the first tag entity word, a second tag entity word belonging to a second part of speech, and a tag relation entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; the tag relation entity word in the tag triplet is used for characterizing the association relationship between the first tag entity word and the second tag entity word in the tag triplet;
the first acquisition module is also used for inputting the sample text into a text recognition initial model, and in the text recognition initial model, the prediction sharing semantic vector corresponding to each sample word in the sample text is acquired;
the second acquisition module is used for acquiring a first predicted entity word in the sample text based on the predicted shared semantic vector;
the first generation module is used for acquiring a prediction condition sharing semantic vector corresponding to the first prediction entity word from the prediction sharing semantic vector, and carrying out vector fusion on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector;
The second generation module is used for carrying out vector recognition on the prediction target semantic vector to obtain a prediction triplet comprising a first prediction entity word, a second prediction entity word and a prediction relation entity word; the second predicted entity word in the predicted triplet belongs to the sample text;
the third generation module is used for adjusting parameters in the initial text recognition model according to the first predicted entity word, the first tag entity word, the predicted triplet and the tag triplet to generate a text recognition model; the text recognition model is used to generate triples for text.
Wherein, the third generation module includes:
the first generation unit is used for generating an entity loss value according to the first predicted entity word and the first tag entity word;
the second generation unit is used for generating a relation loss value according to the prediction triplet and the label triplet;
the third generation unit is used for determining a total loss value corresponding to the text recognition initial model according to the entity loss value and the relation loss value;
and the third generation unit is also used for adjusting parameters in the text recognition initial model according to the total loss value to generate a text recognition model.
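A minimal sketch of the described loss combination — an entity loss plus a relation loss accumulated into a total loss — using cross-entropy for both heads. The patent does not name the concrete loss functions or how they are combined, so both choices and the simple sum are assumptions:

```python
import numpy as np

def cross_entropy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean negative log-likelihood of the correct classes."""
    return float(-np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean())

# Illustrative predicted distributions for the entity head and the relation head.
entity_probs = np.array([[0.9, 0.1], [0.2, 0.8]])
entity_labels = np.array([0, 1])
relation_probs = np.array([[0.7, 0.2, 0.1]])
relation_labels = np.array([0])

entity_loss = cross_entropy(entity_probs, entity_labels)      # first predicted vs. tag entity word
relation_loss = cross_entropy(relation_probs, relation_labels)  # predicted vs. tag triplet
total_loss = entity_loss + relation_loss  # the combination (a plain sum) is assumed
```

Parameters of the initial model would then be adjusted to minimize `total_loss`, yielding the trained text recognition model.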
In one aspect, the present application provides a computer device comprising: a processor, a memory, a network interface;
The processor is connected to the memory and the network interface, where the network interface is used to provide a data communication function, the memory is used to store a computer program, and the processor is used to call the computer program to make the computer device execute the method in the embodiment of the present application.
In one aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, the computer program being adapted to be loaded by a processor and to perform a method according to embodiments of the present application.
In one aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium; the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods in the embodiments of the present application.
In the embodiment of the application, the computer device can acquire a shared semantic vector corresponding to each word in a text, acquire from the text a first entity word belonging to a first part of speech, further acquire the conditional shared semantic vector corresponding to the first entity word from the shared semantic vectors, and perform vector fusion on the conditional shared semantic vector and the shared semantic vectors to obtain a target semantic vector; vector recognition is then performed on the target semantic vector to obtain a triplet comprising the first entity word, a second entity word belonging to a second part of speech, and a relation entity word, where the relation entity word in the triplet can be used for characterizing the association relationship between the first entity word and the second entity word in the triplet. As can be seen from the foregoing, the application first obtains the first entity word in the text, then inputs the conditional shared semantic vector of the first entity word as a condition and fuses it with the shared semantic vectors, so that the target semantic vector is a semantic vector that carries the condition. In other words, recognition is performed with the first entity word as a given condition, so that the second entity word identified in the text has, under the condition of the given first entity word, the association relationship indicated by the relation entity word with the first entity word. Therefore, the embodiment of the application can improve the recognition rate of the association relationship between the first entity word and the second entity word.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
FIG. 2 is a schematic view of a scenario of data processing provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a text recognition model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a first coding layer according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 7 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
For ease of understanding, the following simple explanation of partial nouns is first made:
artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field therefore involves natural language, i.e., the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graph techniques, and the like. In embodiments of the present application, natural language processing techniques may be used to identify entity words (e.g., company names, activity names, etc.) in text.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how computers simulate or implement human learning behavior to acquire new knowledge or skills, and how they reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction. In the embodiments of the present application, the text recognition model is an AI model based on machine learning technology and can be used to recognize text.
Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present application. As shown in fig. 1, the system may include a service server 100 and a terminal cluster, and the terminal cluster may include: the terminal device 200a, the terminal device 200b, the terminal devices 200c, …, and the terminal device 200n, it will be appreciated that the above system may include one or more terminal devices, and the number of terminal devices is not limited in this application.
A communication connection may exist within the terminal cluster; for example, a communication connection exists between the terminal device 200a and the terminal device 200b, and a communication connection exists between the terminal device 200a and the terminal device 200c. Meanwhile, any terminal device in the terminal cluster may have a communication connection with the service server 100; for example, a communication connection exists between the terminal device 200a and the service server 100. The connection manner of the communication connection is not limited: it may be a direct or indirect connection through wired communication, a direct or indirect connection through wireless communication, or another manner, which is not limited herein.
It should be understood that each terminal device in the terminal cluster shown in fig. 1 may be provided with an application client, and when the application client runs in each terminal device, it may perform data interaction, i.e. the above communication connection, with the service server 100 shown in fig. 1. The application client may be an application client with a text recognition function, such as a short video application, a live broadcast application, a social application, an instant messaging application, a game application, a music application, a shopping application, a novel application, a browser, and the like. The application client may be an independent client, or may be an embedded sub-client integrated in a client (for example, a social client, an educational client or a multimedia client), which is not limited herein. Taking a browser as an example, the service server 100 may be a set of servers corresponding to the browser, such as a background server and a data processing server, so that each terminal device may perform data transmission with the service server 100 through the application client corresponding to the browser. For example, each terminal device may upload a local text to the service server 100 through the application client of the browser; the service server 100 may identify the text, determine a first entity word and a second entity word in the text, determine an association relationship between the first entity word and the second entity word, and then return a triplet including the first entity word, the second entity word and a relationship entity word representing the association relationship between them to the terminal device, or transmit the triplet to a cloud server.
It will be appreciated that, in the specific embodiments of the present application, when related data such as user information (e.g., text containing user information) is involved, user approval or consent is required when the embodiments of the present application are applied to specific products or technologies, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
For the convenience of subsequent understanding and description, the embodiment of the present application may select one terminal device as a target terminal device in the terminal cluster shown in fig. 1, for example, use the terminal device 200a as a target terminal device. When the text is acquired and an identification instruction for the relationship between entities of the text is received, the terminal device 200a may transmit the text as query information to the service server 100. Further, after receiving the text sent by the terminal device 200a, the service server 100 may generate a shared semantic vector corresponding to each word in the text through a shared coding layer in the text recognition model, and based on the shared semantic vector, the service server 100 may obtain a first entity word belonging to the first part of speech in the text; further, the service server 100 obtains a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector on the condition of the first entity word, and performs vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector on the condition of the conditional shared semantic vector; further, the service server 100 performs vector recognition on the target semantic vector to obtain a triplet including the first entity word, the second entity word belonging to the second part of speech, and the relationship entity word, where the second entity word in the triplet belongs to text, and the relationship entity word in the triplet is used to represent an association relationship between the first entity word and the second entity word in the triplet. 
Obviously, the extraction process of the second entity word in the embodiments of the present application differs from the extraction process of the first entity word: the second entity word is determined on the condition of the first entity word, i.e., on the premise that it has some relationship with the first entity word, and identifying the association relationship between the first entity word and the second entity word in this manner can improve the relation recognition rate.
Subsequently, the service server 100 may transmit the triplet to the terminal device 200a, and the terminal device 200a may display the triplet on its corresponding screen after receiving the triplet transmitted by the service server 100.
Optionally, if the text recognition model is stored locally in the terminal device 200a, the terminal device 200a may obtain the shared semantic vector corresponding to each word in the text through the text recognition model, then obtain the first entity word based on the shared semantic vector, and obtain the conditional shared semantic vector based on the first entity word, and the subsequent process is consistent with the description above, so that the description is omitted here. Wherein the text recognition model local to the terminal device 200a may be transmitted to the terminal device 200a by the service server 100.
Alternatively, it will be understood that the system architecture may include a plurality of service servers, and one terminal device may be connected to one service server; each service server may obtain the text uploaded by the terminal devices connected to it, identify the text, obtain a triplet associated with the text, and return the triplet to the terminal device connected to the service server.
It should be noted that the service server 100, the terminal device 200a, the terminal device 200b and the terminal device 200c may be blockchain nodes in a blockchain network, and the data described throughout the text (for example, texts and triplets) may be stored in association; the storage manner may be that the blockchain nodes generate blocks according to the data and add the blocks to the blockchain for storage.
Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. It is mainly used to sort data in chronological order and encrypt it into a ledger, so that the ledger cannot be tampered with or forged, while the data can be verified, stored and updated. A blockchain is essentially a decentralized database in which each node stores an identical copy of the blockchain; a blockchain network can divide nodes into core nodes, data nodes and light nodes, which together form the blockchain nodes. The core node is responsible for the consensus of the whole blockchain network, that is to say, the core node is a consensus node in the blockchain network. The process of writing transaction data in the blockchain network into the ledger may be as follows: a data node or a light node in the blockchain network acquires the transaction data and relays it through the blockchain network (that is, the nodes pass it along in a baton manner) until a consensus node receives it; the consensus node then packages the transaction data into a block, performs consensus on the block, and writes the transaction data into the ledger after the consensus is completed. Taking text and triplets as an example of transaction data, after the transaction data passes consensus, the service server 100 (a blockchain node) generates a block according to the transaction data and stores the block in the blockchain network; to read the transaction data (i.e., the text and the triplets), a blockchain node in the blockchain network may obtain the block containing the transaction data, and then obtain the transaction data from the block.
It is understood that the method provided in the embodiments of the present application may be performed by a computer device, including but not limited to a terminal device or a service server. The service server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud databases, cloud services, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data and artificial intelligence platforms. The terminal device includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like. The terminal device and the service server may be directly or indirectly connected in a wired or wireless manner, which is not limited herein.
It can be understood that the system architecture can be applied to business scenarios such as entity word extraction scenarios for text, entity word relationship identification scenarios for text, entity word matching scenarios, and the like, and specific business scenarios will not be listed here.
Further, referring to fig. 2, fig. 2 is a schematic view of a scenario of data processing according to an embodiment of the present application. The implementation process of the data processing scenario may be performed in a service server, or may be performed in a terminal device, or may be performed interactively in a terminal device and a service server, where the terminal device may be any one of terminal devices in a terminal cluster in the embodiment corresponding to fig. 1, and fig. 2 is described by taking a terminal device 200a as an example, and the service server may be the service server 100 in the embodiment corresponding to fig. 1. The embodiments of the present application may be applied to various scenarios including, but not limited to, cloud technology, artificial intelligence, intelligent transportation, assisted driving, and the like.
As shown in fig. 2, the device user 2A sends the text 20a and a relationship recognition request between entities for the text 20a to the service server 100 through the terminal device 200a. After the service server 100 obtains the relationship recognition request and the text 20a, it obtains a trained text recognition model, where the text recognition model may be divided into three sub-networks, such as the shared recognition network 201b, the first recognition network 202b and the second recognition network 203b illustrated in fig. 2. The model structures corresponding to the text recognition model, the shared recognition network 201b, the first recognition network 202b and the second recognition network 203b are not described here; please refer to the description in the embodiment corresponding to fig. 3 below.
The service server 100 inputs the text 20a to the shared recognition network 201b; the shared recognition network 201b encodes each word in the text 20a and outputs a shared semantic vector corresponding to each word, such as the shared semantic vector 201c, …, shared semantic vector 20nc illustrated in fig. 2, where n is a positive integer greater than 1 that characterizes the total number of shared semantic vectors. The service server 100 inputs the shared semantic vector 201c, …, shared semantic vector 20nc into the first recognition network 202b; the first recognition network 202b correspondingly recognizes the shared semantic vector 201c, …, shared semantic vector 20nc, and outputs the first entity words belonging to the first part of speech in the text 20a. It can be understood that the total number of first entity words may be equal to 1 or greater than 1, depending on the content of the text 20a.
When the total number of first entity words is greater than 1, the service server 100 selects one entity word from the first entity words as a conditional entity word, and obtains the conditional shared semantic vector corresponding to the conditional entity word from the shared semantic vector 201c, …, shared semantic vector 20nc, such as the conditional shared semantic vector 202c illustrated in fig. 2. Further, the service server 100 inputs the conditional shared semantic vector and the shared semantic vector 201c, …, shared semantic vector 20nc into the second recognition network 203b, and the second recognition network 203b performs vector fusion on the conditional shared semantic vector with the shared semantic vector 201c, …, shared semantic vector 20nc to obtain target semantic vectors; it may be understood that the total number of target semantic vectors is equal to n. Further, the service server 100 performs vector recognition on the target semantic vectors through the second recognition network 203b to obtain a triplet including the first entity word, a second entity word belonging to the second part of speech, and a relationship entity word that can represent the association relationship between the first entity word and the second entity word, where the second entity word belongs to the text 20a, and the relationship entity word belongs to the entity words pointed to by the relationship recognition layer in the second recognition network 203b. It can be appreciated that, in the case where the total number of first entity words is equal to 1, the process of generating the triplet for the text 20a is consistent with the above process, and a detailed description thereof is therefore omitted here.
Referring to fig. 2 again, after the service server 100 generates the triplet for the text 20a, the triplet is used as recommendation information and returned to the terminal device 200a. After the terminal device 200a obtains the triplet, downstream processing can be performed on the target object pointed to by the first entity word according to the second entity word and the relationship entity word in the triplet. For ease of understanding, this embodiment gives examples of downstream tasks: for example, if the triplet is (company c, participation, illegal organization), the device user 2A can refer to the triplet to pay attention to the activities of company c (the target object) or cease dealings with company c; for another example, if the triplet is (account z, send, spam), the terminal device 200a may put account z (the target object) on the mailbox blacklist. It can be understood that the text type and text content of the text 20a are not limited in the embodiments of the present application, so the type of the triplet is not limited either and can be set according to the actual application scenario.
Further, referring to fig. 3, fig. 3 is a flow chart of a data processing method according to an embodiment of the present application. The data processing method may be performed by a service server (e.g., the service server 100 shown in fig. 1 described above), or may be performed by a terminal device (e.g., the terminal device 200a shown in fig. 1 described above), or may be performed interactively by the service server and the terminal device. For ease of understanding, embodiments of the present application will be described with this method being performed by a service server as an example. As shown in fig. 3, the data processing method may include at least the following steps S101 to S104.
Step S101, acquiring a text, and acquiring shared semantic vectors corresponding to each word in the text.
Specifically, a text recognition model is obtained, and a text is input into the text recognition model; the text recognition model comprises an input layer and a shared coding layer; the text is segmented based on the input layer to obtain at least two segmented words, and the at least two segmented words are respectively input into the shared coding layer; and respectively carrying out vector coding on at least two segmented words based on the shared coding layer to obtain shared semantic vectors respectively corresponding to each segmented word.
The text is acquired by the service server; the text type and content are not limited in the embodiments of the present application and may be, for example, mail text, physical examination text, poetry text, etc. It should be noted that all data (for example, text) mentioned in the present application is processed only after the permission granted by the terminal object is acquired. Referring to fig. 4, fig. 4 is a schematic structural diagram of a text recognition model according to an embodiment of the present application. As shown in fig. 4, the service server obtains a text recognition model, where the text recognition model may include an input layer and a shared coding layer; it is understood that the network formed by the input layer and the shared coding layer may be identical to the shared recognition network 201b in the embodiment corresponding to fig. 2 above.
Further, the service server inputs the text 40a to the input layer of the text recognition model, and performs segmentation processing on the text 40a based on the input layer to obtain at least two segmented words, such as the segmented words X_1, X_2, X_3, …, X_n illustrated in fig. 4, where n is a positive integer greater than 1 that characterizes the total number of the at least two segmented words. The embodiments of the present application do not limit the type of segmented word. A segmented word may be a word: for example, for the text "patient fever", the at least two segmented words may be the words "patient" and "fever". A segmented word may also be a sub-word: for example, for the text "advertising is the department of company", the at least two segmented words may include ad, ver, tis, ing, is, the, de, pa, rt, ment, of, com, pany. It can be understood that the attributes of the segmented words may be set according to the actual application scenario.
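The word-versus-sub-word distinction above can be sketched as follows. The fixed-size chunking below is a toy stand-in for whatever sub-word tokenizer (e.g., a learned BPE/WordPiece vocabulary) would actually be used, so the exact pieces it produces are an assumption, not the model's real segmentation:

```python
def subword_chunks(word: str, size: int = 3) -> list[str]:
    """Toy sub-word segmentation: split a word into fixed-size chunks.
    A real tokenizer (BPE/WordPiece) would instead pick variable-length
    pieces from a learned vocabulary."""
    return [word[i:i + size] for i in range(0, len(word), size)]

def segment(text: str) -> list[str]:
    # Word-level segmentation first, then sub-word chunking of each word.
    return [piece for word in text.split() for piece in subword_chunks(word)]

print(segment("advertising is the department of company"))
```

Short words pass through unchanged (e.g. "is" stays "is"), while longer words are broken into several segmented words, mirroring the mixed word/sub-word list in the text above.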
The service server inputs the at least two segmented words into the shared coding layer. Based on the shared coding layer, initial vectors corresponding to the at least two segmented words are first obtained, such as the initial vector E_1 corresponding to segmented word X_1, the initial vector E_2 corresponding to segmented word X_2, the initial vector E_3 corresponding to segmented word X_3, …, and the initial vector E_n corresponding to segmented word X_n shown in fig. 4. The at least two initial vectors are then encoded respectively to obtain shared semantic vectors corresponding to the at least two segmented words, such as the shared semantic vector G_1 corresponding to segmented word X_1, the shared semantic vector G_2 corresponding to segmented word X_2, the shared semantic vector G_3 corresponding to segmented word X_3, …, and the shared semantic vector G_n corresponding to segmented word X_n shown in fig. 4. It should be understood that, in practical applications, the shared coding layer may be an independent deep neural network or at least one deep convolutional layer (the number of convolutional layers is not limited in this application); for example, the shared coding layer may be a bi-directional language model obtained through pre-training, including but not limited to an ELMo network (Embeddings from Language Models) and a BERT network (Bidirectional Encoder Representations from Transformers). Based on this, in the embodiments of the present application, the shared coding layer needs to be pre-trained based on a dictionary database to generate a deep bi-directional language model.
Step S102, based on the shared semantic vector, a first entity word belonging to a first part of speech in the text is obtained.
Specifically, a text recognition model is obtained; the text recognition model comprises a second coding layer, an entity recognition layer and a decoding layer; inputting the shared semantic vector into a second coding layer, and carrying out vector coding on the shared semantic vector based on the second coding layer to obtain a semantic vector to be identified; inputting the semantic vector to be identified into an entity identification layer, and carrying out vector identification on the semantic vector to be identified based on the entity identification layer to obtain a semantic vector to be decoded for representing the first entity word; inputting the semantic vector to be decoded for representing the first entity word into a decoding layer, and carrying out vector decoding on the semantic vector to be decoded for representing the first entity word based on the decoding layer to obtain the first entity word belonging to the first part of speech in the text.
The semantic vectors to be identified include a semantic vector to be identified A_b, where b is a positive integer and b is less than or equal to the total number of semantic vectors to be identified; the entity recognition layer includes an entity recognition component C_b for the semantic vector to be identified A_b. The specific process of obtaining the semantic vector to be decoded for characterizing the first entity word may include: if the entity recognition component C_b is the first entity recognition component in the entity recognition layer, vector recognition is performed on the semantic vector to be identified A_b in the entity recognition component C_b to obtain the semantic vector to be decoded D_b corresponding to the semantic vector to be identified A_b; if the entity recognition component C_b is not the first entity recognition component in the entity recognition layer, vector fusion is performed, in the entity recognition component C_b, on the semantic vector to be identified A_b and the target semantic vector to be identified that has a position association relationship with the semantic vector to be identified A_b, to obtain the semantic vector to be decoded D_b corresponding to the semantic vector to be identified A_b, where the target semantic vector to be identified belongs to the semantic vectors to be identified. The semantic vector to be decoded for characterizing the first entity word is then obtained based on the semantic vectors to be decoded corresponding to the respective semantic vectors to be identified.
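The component chain described above can be sketched minimally. Two assumptions are made here, since the text leaves them open: the "target semantic vector to be identified with a position association relationship" is taken to be the immediately preceding vector A_{b-1}, and the fusion operation is taken to be simple averaging:

```python
import numpy as np

def to_decode_vectors(to_identify: list[np.ndarray]) -> list[np.ndarray]:
    """For the first entity recognition component, D_1 is produced from A_1
    alone; every later component C_b fuses A_b with the positionally
    associated vector, assumed here to be A_{b-1}.  Averaging is a
    hypothetical stand-in for the unspecified fusion step."""
    out = []
    for b, a_b in enumerate(to_identify):
        if b == 0:                        # first entity recognition component
            out.append(a_b.copy())
        else:                             # fuse with the preceding vector
            out.append((a_b + to_identify[b - 1]) / 2.0)
    return out

vecs = [np.array([1.0, 1.0]), np.array([3.0, 1.0]), np.array([5.0, 3.0])]
for d in to_decode_vectors(vecs):
    print(d)
```

The point of the sketch is only the control flow: the first position is handled without fusion, every other position is fused with its positional neighbor before decoding.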
The entity extraction refers to extracting a corresponding entity from the text, and the embodiment of the application does not limit the entity attribute, and can be a medical entity word, a geographic entity word, a company entity word and the like, and can be set according to an actual application scene.
Referring again to fig. 4, the text recognition model includes a second coding layer, an entity recognition layer and a decoding layer, where the network formed by the second coding layer, the entity recognition layer and the decoding layer may be identical to the first recognition network 202b in the embodiment corresponding to fig. 2. The service server inputs the at least two shared semantic vectors (e.g., the shared semantic vector G_1, shared semantic vector G_2, shared semantic vector G_3, …, shared semantic vector G_n in fig. 4) into the second coding layer, and encodes the at least two shared semantic vectors respectively based on the second coding layer to obtain semantic vectors to be identified corresponding to the at least two segmented words, such as the semantic vector to be identified T_1 corresponding to segmented word X_1, the semantic vector to be identified T_2 corresponding to segmented word X_2, the semantic vector to be identified T_3 corresponding to segmented word X_3, …, and the semantic vector to be identified T_n corresponding to segmented word X_n shown in fig. 4. It should be understood that, in practical applications, the second coding layer may be an independent deep neural network or at least one deep convolutional layer (the number of convolutional layers is not limited in this application); for example, the second coding layer may be a bi-directional language model obtained through pre-training, including but not limited to an ELMo network (Embeddings from Language Models) and a BERT network (Bidirectional Encoder Representations from Transformers). Based on this, in the embodiments of the present application, the second coding layer needs to be pre-trained based on a dictionary database to generate a deep bi-directional language model.
Further, the service server inputs the at least two semantic vectors to be identified into the entity recognition layer, and performs recognition processing on them based on the entity recognition layer to obtain the semantic vector to be decoded corresponding to each segmented word. A semantic vector to be decoded can characterize the position of its corresponding segmented word within a certain first entity word, as well as the entity type of that first entity word. As illustrated in fig. 4, the semantic vector to be decoded corresponding to segmented word X_1 (e.g., B-com in fig. 4) may indicate that segmented word X_1 is the starting position of a first entity word of type com (a company entity word); the semantic vector to be decoded corresponding to segmented word X_2 (I-com in fig. 4) may indicate that segmented word X_2 is a non-starting position of that first entity word; and the semantic vector to be decoded corresponding to segmented word X_3 (O in fig. 4) may indicate that segmented word X_3 is not part of a first entity word. The service server can therefore determine that segmented word X_1 and segmented word X_2 may compose a complete first entity word of type com.
Further, the service server inputs the semantic vectors to be decoded corresponding to the respective segmented words into the decoding layer, and performs vector decoding on them based on the decoding layer to obtain the first entity words belonging to the first part of speech in the text. The embodiments of the present application do not limit the total number of first entity words, which may be one or more. For example, the service server performs entity extraction on the text "patient fever" and extracts the first entity word "fever"; for another example, it performs entity extraction on the text "Cosmetics packaging disqualification! Brands such as brand 1, brand 2, etc.; a woman in her sixties sank deep into illegal organization 3", and obtains the first entity words "brand 1", "brand 2" and "illegal organization 3". It will be appreciated that a first entity word may include one or more segmented words; the first entity word of type com illustrated in fig. 4 includes segmented word X_1 and segmented word X_2.
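The B-/I-/O labels in fig. 4 can be turned into entity words by a standard BIO decode. The sketch below is a generic version of that step, not the decoding layer itself; the tag names and the space-free join of segment pieces are assumptions for illustration:

```python
def decode_bio(segments: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Collect (entity_type, entity_word) pairs from BIO tags:
    B-xxx starts an entity of type xxx, I-xxx continues it, O ends it."""
    entities, cur_type, cur_parts = [], None, []
    for seg, tag in zip(segments, tags):
        if tag.startswith("B-"):
            if cur_type is not None:
                entities.append((cur_type, "".join(cur_parts)))
            cur_type, cur_parts = tag[2:], [seg]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            cur_parts.append(seg)
        else:  # "O" or an inconsistent I- tag closes any open entity
            if cur_type is not None:
                entities.append((cur_type, "".join(cur_parts)))
            cur_type, cur_parts = None, []
    if cur_type is not None:
        entities.append((cur_type, "".join(cur_parts)))
    return entities

# The fig. 4 example: X_1 tagged B-com, X_2 tagged I-com, X_3 tagged O.
print(decode_bio(["X1", "X2", "X3"], ["B-com", "I-com", "O"]))
```

This reproduces the determination described above: X_1 and X_2 compose one complete first entity word of type com, while X_3 is discarded.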
The first part of speech may include a subject part of speech, i.e., the first entity word is a subject entity word, and the second part of speech may include an object part of speech, i.e., the second entity word is an object entity word. It may be appreciated that, alternatively, the first part of speech may be an object part of speech, in which case the first entity word is an object entity word, and the second part of speech may include a subject part of speech, in which case the second entity word is a subject entity word. It is to be understood that the first part of speech may also be another part of speech, with the second part of speech corresponding to the first part of speech, which is not limited in the embodiments of the present application.
It should be understood that, in practical applications, the entity recognition layer may be an independent deep neural network or at least one deep convolutional layer (the number of convolutional layers is not limited in this application); for example, the entity recognition layer may be a Conditional Random Field (CRF) obtained through pre-training. Based on this, in the embodiments of the present application, the entity recognition layer needs to be trained in advance based on a dictionary database to generate the CRF model. Alternatively, the present application may perform sequence labeling with a classifier, i.e., output 0/1 tags, where 1 indicates that the segmented word should be extracted as part of the first entity word.
In the named entity recognition task, the output at each position affects the outputs at the following positions; for example, in part-of-speech recognition, if the output at the previous position is a verb, the output at the next position is very unlikely to also be a verb. The entity recognition layer applies a label transition constraint so that the output at each position depends on both the output of the previous layer at that position and the output of the previous layer at the preceding position; this process can be represented by formula (1).
Z_i = y_{i-1} G ⊙ y_i    (1)

In formula (1), Z_i represents the output at the i-th position of the current layer, G is the transition matrix, y_i is the previous-layer output at the current position, and y_{i-1} is the previous-layer output at the preceding position.
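The label-transfer constraint of formula (1) can be sketched numerically as follows; this is a minimal illustration with toy dimensions, and the transition matrix and layer outputs are hypothetical values, not those of a trained model:

```python
import numpy as np

def transfer_constrained_output(y_prev, y_curr, G):
    """Formula (1): Z_i = (y_{i-1} G) ⊙ y_i.

    y_prev: previous-layer output at the preceding position, shape (num_tags,)
    y_curr: previous-layer output at the current position, shape (num_tags,)
    G:      tag-transition matrix, shape (num_tags, num_tags)
    """
    return (y_prev @ G) * y_curr  # ⊙ is element-wise multiplication

# Toy example with 3 tags: a transition matrix that forbids tag 2 -> tag 2
G = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
y_prev = np.array([0.0, 0.0, 1.0])  # preceding position is strongly tag 2
y_curr = np.array([0.2, 0.3, 0.5])

z = transfer_constrained_output(y_prev, y_curr, G)
# The tag-2 score at the current position is suppressed to 0 by the constraint
```

The multiplication by G lets an implausible transition (here, verb followed by verb) zero out the corresponding score at the next position.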
Step S103, a conditional shared semantic vector corresponding to the first entity word is obtained from the shared semantic vector, and vector fusion is carried out on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector.
Specifically, a text recognition model is obtained, and the text is input into the text recognition model; the text recognition model comprises an input layer and a shared encoding layer. The text is segmented based on the input layer to obtain at least two segmented words; the at least two segmented words include segmented word E_f, where f is a positive integer no greater than the total number of segmented words. The position information of segmented word E_f in the text is obtained and input into the shared encoding layer; based on the shared encoding layer, the position information of segmented word E_f is vector-encoded to obtain the shared position vector corresponding to E_f. The position information of the first entity word in the text is then determined; it belongs to the position information of the at least two segmented words in the text. Based on the position information of the first entity word in the text, the shared position vector corresponding to the first entity word is obtained from the shared position vectors corresponding to the at least two segmented words; and based on that shared position vector, the conditional shared semantic vector corresponding to the first entity word is obtained from the shared semantic vector.
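The position-based selection above can be sketched as follows; the array shapes and helper name are illustrative assumptions rather than the patent's implementation:

```python
import numpy as np

def gather_conditional_vectors(shared_vectors, entity_positions):
    """Select the shared semantic vectors at the positions occupied by the
    first entity word (a sketch; in the model these positions come from the
    shared position vectors produced by the shared encoding layer).

    shared_vectors:   (seq_len, hidden) array, one row per word in the text
    entity_positions: list of word positions of the first entity word
    """
    return shared_vectors[entity_positions]

# Toy text of 5 words with hidden size 4; the entity occupies positions 1 and 2
shared = np.arange(20, dtype=float).reshape(5, 4)
cond = gather_conditional_vectors(shared, [1, 2])
# cond has shape (2, 4): the rows for the two words of the first entity word
```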
Specifically, a text recognition model is obtained; the text recognition model comprises a first encoding layer, and the first encoding layer includes a self-attention component, a first normalization component, a feed-forward component, and a second normalization component. The shared semantic vector is input into the self-attention component and vector-encoded based on the self-attention component to obtain a first semantic vector to be normalized; the first semantic vector to be normalized and the shared semantic vector are then input into the first normalization component, where they are weighted and fused to obtain a semantic vector to be fed forward; the semantic vector to be fed forward is input into the feed-forward component and vector-encoded based on the feed-forward component to obtain a second semantic vector to be normalized; finally, the conditional shared semantic vector, the second semantic vector to be normalized, and the semantic vector to be fed forward are input into the second normalization component, where vector fusion is performed on the three to obtain the target semantic vector.
The second normalization component comprises an averaging sub-component, a distance sub-component, a standard-deviation sub-component, a scaling sub-component, a weighting sub-component, and a fusion sub-component. The specific process of obtaining the target semantic vector may include: vector-averaging the second semantic vector to be normalized and the semantic vector to be fed forward based on the averaging sub-component to obtain an average semantic vector; based on the distance sub-component, obtaining the vector distance between the second semantic vector to be normalized and the average semantic vector as a first distance vector, and the vector distance between the semantic vector to be fed forward and the average semantic vector as a second distance vector; based on the standard-deviation sub-component, computing the standard deviation over the second semantic vector to be normalized and the semantic vector to be fed forward to obtain a standard semantic vector; based on the scaling sub-component, vector-scaling the first distance vector with the standard semantic vector to obtain a first scaling vector, and the second distance vector with the standard semantic vector to obtain a second scaling vector; generating, from the conditional shared semantic vector, a corresponding first weight feature and a corresponding second weight feature; based on the weighting sub-component, performing weighted fusion on the first scaling vector, the second scaling vector, and the first weight feature to obtain a semantic vector to be fused; and, based on the fusion sub-component, performing vector fusion on the second weight feature and the semantic vector to be fused to obtain the target semantic vector.
In combination with the description in the embodiment corresponding to step S102 and fig. 2, the total number of first entity words may be one or more. If the service server extracts a plurality of first entity words from the text, one of them is used as the condition; for convenience of description, the first entity word used as the condition is referred to as the conditional entity word. The service server obtains the shared position vector corresponding to the conditional entity word and inputs it into the first encoding layer. It can be understood that, since the first encoding layer and the second encoding layer share the shared encoding layer upstream, the first encoding layer can obtain, from the shared encoding vectors corresponding to each word in the shared encoding layer, the conditional shared semantic vector corresponding to the shared position vector of the conditional entity word.
The shared encoding layer, the first encoding layer, and the second encoding layer have the same structure, and each may be composed of a self-attention mechanism and a feed-forward neural network; please refer to fig. 4 and fig. 5, where fig. 5 is a schematic structural diagram of the first encoding layer according to an embodiment of the present application. As shown in fig. 5, the service server inputs the shared semantic vector G_3 into the self-attention component. Fig. 5 illustrates the self-attention component with a multi-head self-attention mechanism: for each word, the word is re-expressed through the other words in the sentence, where each of the other words contributes to the expression with its own weight; a multi-head self-attention mechanism, i.e., multiple attention mechanisms in parallel, can capture relationships at different abstraction levels from different angles. The multi-head self-attention mechanism can be represented by formula (2).
MultiHead(Q, K, V) = Concat(Head_1; Head_2; …; Head_n) W_0    (2)

Head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)    (3)
Here Q, K, and V are all high-dimensional representations output by the upper layer. In the self-attention mechanism, Q and K are used to learn the importance of the other words in the sentence to the current word, and the result is then dot-multiplied with V, yielding the representation of the current word in terms of the other words in the sentence. In formula (3), Head_i represents the i-th self-attention head, n represents the number of self-attention heads, W_i^Q, W_i^K, and W_i^V represent the weights corresponding to Q, K, and V respectively, and W_0 represents the concatenation weight of the multi-head self-attention mechanism.
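Formulas (2) and (3) can be sketched compactly in NumPy; the head count, dimensions, and random weights below are illustrative assumptions (a sketch of the mechanism, not the patent's trained parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Q and K learn each word's importance to the others; the softmaxed
    # scores then weight V, re-expressing each word through the sentence.
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, Wq, Wk, Wv, W0):
    # Formula (3): Head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)
    heads = [attention(Q @ wq, K @ wk, V @ wv)
             for wq, wk, wv in zip(Wq, Wk, Wv)]
    # Formula (2): MultiHead = Concat(Head_1; ...; Head_n) W_0
    return np.concatenate(heads, axis=-1) @ W0

rng = np.random.default_rng(0)
seq, d_model, n_heads, d_head = 5, 8, 2, 4
X = rng.normal(size=(seq, d_model))            # upper-layer output; Q = K = V
Wq = rng.normal(size=(n_heads, d_model, d_head))
Wk = rng.normal(size=(n_heads, d_model, d_head))
Wv = rng.normal(size=(n_heads, d_model, d_head))
W0 = rng.normal(size=(n_heads * d_head, d_model))

out = multi_head(X, X, X, Wq, Wk, Wv, W0)      # one row per word
```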
Further, the service server vector-encodes the shared semantic vector G_3 based on the self-attention component to obtain the first semantic vector to be normalized, and inputs the first semantic vector to be normalized and the shared semantic vector G_3 into the first normalization component. As shown in fig. 5, the component structure of the first normalization component is identical to that of the second normalization component; the difference between the two lies only in the data they process. The input W of the first normalization component comprises the first semantic vector to be normalized and the shared semantic vector G_3, and its weight parameters are β and γ, both parameters of the text recognition model; the component structure of the first normalization component is therefore not described here, and reference may be made to the description of the second normalization component below. Based on the first normalization component, the service server performs weighted fusion on the first semantic vector to be normalized and the shared semantic vector G_3 to obtain the semantic vector W' to be fed forward; the process can be represented by formula (4).
W' = γ ⊙ (W − Avg(W)) / Std(W) + β    (4)
The meaning of the components in formula (4) is explained in the following formula (5), and will not be described here.
The service server inputs the semantic vector W' to be fed forward into the feed-forward component, and vector-encodes it based on the feed-forward component to obtain the second semantic vector to be normalized. In conjunction with fig. 4 and fig. 5, the embodiment of the present application takes the conditional shared semantic vector consisting of shared semantic vector G_1 and shared semantic vector G_2 as an example. The service server inputs the conditional shared semantic vector, the second semantic vector to be normalized, and the semantic vector W' to be fed forward into the second normalization component, whose input U comprises the second semantic vector to be normalized and the semantic vector W' to be fed forward. The second normalization component comprises an averaging sub-component, a distance sub-component 50a, a standard-deviation sub-component, a scaling sub-component 50d, a weighting sub-component 50b, and a fusion sub-component 50c. The service server vector-averages the second semantic vector to be normalized and the semantic vector W' to be fed forward based on the averaging sub-component to obtain an average semantic vector; based on the distance sub-component 50a, obtains the vector distance between the second semantic vector to be normalized and the average semantic vector as a first distance vector, and the vector distance between the semantic vector W' to be fed forward and the average semantic vector as a second distance vector; computes, based on the standard-deviation sub-component, the standard deviation over the second semantic vector to be normalized and the semantic vector W' to be fed forward to obtain a standard semantic vector; and, based on the scaling sub-component 50d, vector-scales the first distance vector with the standard semantic vector to obtain a first scaling vector, and the second distance vector with the standard semantic vector to obtain a second scaling vector. As shown in fig. 5, based on the weight matrix C_1, the service server generates the first weight feature corresponding to the conditional shared semantic vector (i.e., shared semantic vector G_1 and shared semantic vector G_2), and based on the weight matrix C_2 generates the corresponding second weight feature. Based on the weighting sub-component, it performs weighted fusion on the first scaling vector, the second scaling vector, and the first weight feature to obtain a semantic vector to be fused; and based on the fusion sub-component, it performs vector fusion on the second weight feature and the semantic vector to be fused to obtain the target semantic vector M_3 corresponding to the shared semantic vector G_3. The above process can be represented by formula (5).
U' = C_1(G_1, G_2) ⊙ (U − Avg(U)) / Std(U) + C_2(G_1, G_2)    (5)
Here U' represents the target semantic vector M_3 corresponding to the shared semantic vector G_3; C_1(G_1, G_2) represents the first weight feature, C_2(G_1, G_2) represents the second weight feature, Avg represents the averaging sub-component, and Std represents the standard-deviation sub-component. In effect, the present application constructs a conditional normalization layer for fitting, given a subject (e.g., the first entity word X_1X_2 illustrated in fig. 4), the probabilities of extracting objects and relationships.
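The conditional normalization of formula (5) can be sketched as follows: the scale and shift of an otherwise standard layer normalization are generated from the conditional shared semantic vector. The dimensions, random values, and the way the condition is flattened into one vector are illustrative assumptions:

```python
import numpy as np

def conditional_layer_norm(U, cond, C1, C2, eps=1e-6):
    """Formula (5): U' = C1(cond) ⊙ (U - Avg(U)) / Std(U) + C2(cond).

    U:      vector to normalize, shape (hidden,)
    cond:   conditional shared semantic vector (the first entity word's
            representation), shape (cond_dim,)
    C1, C2: weight matrices mapping the condition to the first weight
            feature (scale) and second weight feature (shift)
    """
    mean, std = U.mean(), U.std()
    scale = cond @ C1            # first weight feature
    shift = cond @ C2            # second weight feature
    return scale * (U - mean) / (std + eps) + shift

rng = np.random.default_rng(1)
hidden, cond_dim = 6, 4
U = rng.normal(size=hidden)          # fused input of the second norm component
cond = rng.normal(size=cond_dim)     # condition from the first entity word
C1 = rng.normal(size=(cond_dim, hidden))
C2 = rng.normal(size=(cond_dim, hidden))

target = conditional_layer_norm(U, cond, C1, C2)
```

Because scale and shift depend on the first entity word, the same input U yields different target semantic vectors under different conditional entity words.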
Fig. 5 takes the shared semantic vector G_3 and the target semantic vector M_3 as an example to describe how the service server performs vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector. Referring again to fig. 4, by the same process the target semantic vector M_1 corresponding to shared semantic vector G_1, the target semantic vector M_2 corresponding to shared semantic vector G_2, …, and the target semantic vector M_n corresponding to shared semantic vector G_n can also be obtained.
It will be appreciated that the shared encoding layer may learn representations at the vocabulary level, while the first encoding layer and the second encoding layer may learn representations at the semantic level; the pre-trained word vectors can also handle complications such as polysemy, i.e., the word vectors of the same word differ across different contexts.
Step S104, carrying out vector recognition on the target semantic vector to obtain a triplet comprising a first entity word, a second entity word belonging to a second part of speech and a relation entity word; the second entity word in the triplet belongs to text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
Specifically, a text recognition model is obtained; the text recognition model comprises a relation recognition layer; inputting the target semantic vector into a relation recognition layer, and carrying out vector recognition on the target semantic vector based on the relation recognition layer to obtain a recognition semantic vector; based on the recognition semantic vector, a triplet is generated that includes the first entity word, the second entity word that belongs to the second part of speech, and the relationship entity word.
Relationship extraction refers to identifying the relationship between an extracted entity A and entity B. If entity A, the relationship, and entity B form a subject-relation-object structure, (entity A, relationship, entity B) is also called a triplet, where entity A is the subject and entity B is the object. The subject refers to the carrier of practical and cognitive activities; the object refers to what those practical and cognitive activities are directed at.
Referring again to fig. 4, it can be seen from the description of step S103 that the second recognition network 203b shown in fig. 2 (comprising the first encoding layer and the relationship recognition layer in fig. 4) may be an object-and-relationship extraction model based on conditional probability: by adding the identified first entity word as a conditional input, the present application obtains, for each word in the sequence, the probability that it is extracted as the second entity word (belonging to the second part of speech) of a certain relationship. It should be clear that the first encoding layer and the second encoding layer share the shared encoding layer upstream and thus learn common semantic information.
After the results of the conditional normalization layer are obtained, namely the target semantic vector M_1, target semantic vector M_2, target semantic vector M_3, …, target semantic vector M_n illustrated in fig. 4, the service server inputs the target semantic vector corresponding to each word into the relationship recognition layer, performs vector recognition on the target semantic vectors based on the relationship recognition layer to obtain recognition semantic vectors, and generates, based on the recognition semantic vectors, a triplet comprising the first entity word, the second entity word belonging to the second part of speech, and the relationship entity word. The relationship recognition layer provided in the embodiment of the present application may include one or more binary classifiers to extract the second entity word and the relationship entity word, where each binary classifier characterizes one relationship. As illustrated in fig. 4, the relationship recognition layer includes 3 binary classifiers, respectively representing the relationship characterized by relationship entity word S_3, the relationship characterized by relationship entity word S_2, and the relationship characterized by relationship entity word S_1; the number of binary classifiers is not limited by the present application and can be set according to the actual application scenario.
Each binary classifier outputs a 0/1 tag for each word in the sequence, where 1 indicates that, given the first entity word, this word belongs to the second entity word of the relationship the classifier characterizes. As illustrated in fig. 4, the classifier corresponding to relationship entity word S_3 outputs the triplet (X_1X_2, S_3, X_6X_7), which may represent that the first entity word X_1X_2 and the second entity word X_6X_7 have the association relationship characterized by S_3; the classifier corresponding to relationship entity word S_2 outputs the triplet (X_1X_2, S_2, X_3), which may represent that the first entity word X_1X_2 and the second entity word X_3 have the association relationship characterized by S_2; and the classifier corresponding to relationship entity word S_1 outputs the triplet (X_1X_2, S_1, X_{n-1}X_n), which may represent that the first entity word X_1X_2 and the second entity word X_{n-1}X_n have the association relationship characterized by S_1.
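The conversion from per-relation 0/1 tags to triples can be sketched as follows; the word and relation names mirror the fig. 4 illustration, and the helper name is a hypothetical one:

```python
def extract_triples(first_entity, words, relation_tags):
    """For each relation, the per-word 0/1 tags mark which words form the
    second entity word; tagged words are joined to build the object.

    relation_tags: {relation_name: [0/1 tag for each word in the sequence]}
    """
    triples = []
    for relation, tags in relation_tags.items():
        span = [w for w, t in zip(words, tags) if t == 1]
        if span:  # a relation with no tagged word yields no triplet
            triples.append((first_entity, relation, "".join(span)))
    return triples

words = ["X1", "X2", "X3", "X4", "X5", "X6", "X7"]
tags = {
    "S3": [0, 0, 0, 0, 0, 1, 1],  # second entity word: X6 X7
    "S2": [0, 0, 1, 0, 0, 0, 0],  # second entity word: X3
    "S1": [0, 0, 0, 0, 0, 0, 0],  # no second entity word for this relation
}
triples = extract_triples("X1X2", words, tags)
# -> [("X1X2", "S3", "X6X7"), ("X1X2", "S2", "X3")]
```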
In the embodiment of the present application, the service server obtains the shared semantic vector corresponding to each word in the text and, from it, the first entity word belonging to the first part of speech; it then obtains the conditional shared semantic vector corresponding to the first entity word from the shared semantic vector and performs vector fusion on the two to obtain the target semantic vector. Vector recognition of the target semantic vector yields a triplet comprising the first entity word, the second entity word belonging to the second part of speech, and the relationship entity word, where the relationship entity word characterizes the association relationship between the first and second entity words in the triplet. As can be seen from the foregoing, the present application first obtains the first entity word in the text and inputs the conditional shared semantic vector for that first entity word as a condition, fusing it with the shared semantic vector; the target semantic vector is therefore a conditioned semantic vector, i.e., the already-identified first entity word serves as the condition, so that, given the first entity word, the second entity word in the text has the association relationship with the first entity word pointed to by the relationship entity word. The embodiment of the present application can therefore improve the recognition rate of the association relationship between the first entity word and the second entity word.
Referring to fig. 6, fig. 6 is a flowchart of a data processing method according to an embodiment of the present application. The method may be performed by a service server (e.g., the service server 100 shown in fig. 1 and described above), by a terminal device (e.g., the terminal device 200a shown in fig. 1 and described above), or by both the service server and the terminal device. For ease of understanding, embodiments of the present application will be described with this method being performed by a service server as an example. As shown in fig. 6, the method may include at least the following steps.
Step S201, acquiring a text, and acquiring shared semantic vectors corresponding to each word in the text.
Step S202, based on the shared semantic vector, a first entity word belonging to a first part of speech in the text is obtained.
Step S203, a conditional shared semantic vector corresponding to the first entity word is obtained from the shared semantic vector, and vector fusion is carried out on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector.
Step S204, carrying out vector recognition on the target semantic vector to obtain a triplet comprising a first entity word, a second entity word belonging to a second part of speech and a relation entity word; the second entity word in the triplet belongs to text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
In the specific implementation process of step S201 to step S204, please refer to step S101 to step S104 in the embodiment corresponding to fig. 3, which is not described herein.
Step S205, if the association relationship attribute characterized by the relationship entity word in triplet G_h is a negative association relationship, determining the object characterized by the first entity word as the target object.
Specifically, the triples include at least two triples for the first entity word; the at least two triples include triplet G_h, where h is a positive integer no greater than the total number of the at least two triples. It can be understood that the embodiment of the present application does not limit the association relationship attribute, which can be set according to the actual application scenario. One possible scenario is to determine negative companies that have abnormal activity, so the classifiers in the relationship recognition layer may include a classifier for identifying negative association relationships, and negative entity words may include, but are not limited to, impersonation, suspicion, etc.
With the present application, the service server may generate one or more triples for the text; the number of triples is not limited and can be configured according to the actual application scenario.
Step S206, obtaining, from the at least two triples, the triplet including the short-name relationship entity word, and determining the second entity word in that triplet as the short-name entity word; the object characterized by the short-name entity word is equal to the target object; the short-name relationship entity word refers to a relationship entity word whose characterized association relationship attribute is a short-name relationship.
Step S207, obtaining, from the at least two triples, the triplet including the attribution relationship entity word, and determining the second entity word in that triplet as the attribution entity word; the object characterized by the attribution entity word is attributed to the target object; the attribution relationship entity word refers to a relationship entity word whose characterized association relationship attribute is an attribution relationship.
Step S208, storing in association the triplet G_h, the short-name relationship entity word, the short-name entity word, the attribution relationship entity word, and the attribution entity word.
Specifically, in combination with the descriptions of step S206 to step S208, if the association relationship attribute in one of the at least two triples (for the same first entity word) is a negative association relationship, it is determined that the target object characterized by the first entity word has abnormal activity. At this time, the service server may determine whether the remaining triples include a triplet containing a short-name relationship entity word and a triplet containing an attribution relationship entity word: if a triplet including an attribution relationship entity word exists, the triplet including the negative association relationship is stored in association with the attribution relationship entity word and the attribution entity word; if a triplet including a short-name relationship entity word exists, the triplet including the negative association relationship is stored in association with the short-name relationship entity word and the short-name entity word; and if both exist, the triplet including the negative association relationship is stored in association with the short-name relationship entity word, the short-name entity word, the attribution relationship entity word, and the attribution entity word.
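The branching in steps S205–S208 can be sketched as follows; the relation labels, field names, and example entities are hypothetical placeholders for illustration only:

```python
def associate_and_store(triples):
    """triples: list of (subject, relation, object) for one first entity word.
    Returns the record to store in association when a negative association
    relationship is present, otherwise None.
    """
    negative = [t for t in triples if t[1] == "negative"]
    if not negative:
        return None  # no abnormal activity detected for this subject
    record = {"negative": negative, "target": negative[0][0]}
    shorts = [t[2] for t in triples if t[1] == "short_name"]
    owned = [t[2] for t in triples if t[1] == "belongs_to"]
    if shorts:
        record["short_names"] = shorts   # other names for the target object
    if owned:
        record["sub_objects"] = owned    # objects attributed to the target
    return record

record = associate_and_store([
    ("CompanyA", "negative", "suspicion"),
    ("CompanyA", "short_name", "CoA"),
    ("CompanyA", "belongs_to", "SubsidiaryB"),
])
# record bundles the negative triplet with the short name and the sub-object,
# so a later query on any stored entity word can reach the others
```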
Subsequently, when an associated stored entity word is detected, the business server may query other entity words associated with the entity word.
In summary, the embodiment of the present application can improve the accuracy of identifying the relationship between the first entity word and the second entity word, extract the target objects that truly have a negative influence, exclude subjects that merely publicize the prevention of abnormal activity or deny abnormal activity, and thereby reduce misjudgment. Second, when multiple first entity words appear in the text, the embodiment of the present application can match the first entity word corresponding to the target object with the negative vocabulary, without extracting the non-negative first entity words, further reducing misjudgment. Furthermore, other information related to the target object in the text, such as short names and sub-objects, is also extracted, which is an important supplement for the subsequent matching of internal objects.
In the embodiment of the present application, the service server obtains the shared semantic vector corresponding to each word in the text and, from it, the first entity word belonging to the first part of speech; it then obtains the conditional shared semantic vector corresponding to the first entity word from the shared semantic vector and performs vector fusion on the two to obtain the target semantic vector. Vector recognition of the target semantic vector yields a triplet comprising the first entity word, the second entity word belonging to the second part of speech, and the relationship entity word, where the relationship entity word characterizes the association relationship between the first and second entity words in the triplet. As can be seen from the foregoing, the present application first obtains the first entity word in the text and inputs the conditional shared semantic vector for that first entity word as a condition, fusing it with the shared semantic vector; the target semantic vector is therefore a conditioned semantic vector, i.e., the already-identified first entity word serves as the condition, so that, given the first entity word, the second entity word in the text has the association relationship with the first entity word pointed to by the relationship entity word. The embodiment of the present application can therefore improve the recognition rate of the association relationship between the first entity word and the second entity word.
Referring to fig. 7, fig. 7 is a flowchart of a data processing method according to an embodiment of the present application. The method may be performed by a service server (e.g., the service server 100 shown in fig. 1 and described above), by a terminal device (e.g., the terminal device 200a shown in fig. 1 and described above), or by both the service server and the terminal device. For ease of understanding, embodiments of the present application will be described with this method being performed by a service server as an example. As shown in fig. 7, the method may include at least the following steps.
Step S301, obtaining a training sample set; the training sample set comprises sample text, the first tag entity word in the sample text, and the tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to the first part of speech; the tag triplet comprises the first tag entity word, a second tag entity word belonging to the second part of speech, and a tag relationship entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; and the tag relationship entity word in the tag triplet is used for characterizing the association relationship between the first tag entity word and the second tag entity word in the tag triplet.
Step S302, inputting the sample text into a text recognition initial model, and acquiring prediction sharing semantic vectors corresponding to each sample word in the sample text in the text recognition initial model.
Step S303, based on the prediction sharing semantic vector, a first prediction entity word in the sample text is obtained.
Step S304, a prediction condition sharing semantic vector corresponding to the first prediction entity word is obtained from the prediction sharing semantic vector, and vector fusion is carried out on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector.
Step S305, carrying out vector recognition on the predicted target semantic vector to obtain a predicted triplet comprising a first predicted entity word, a second predicted entity word and a predicted relation entity word; the second predicted entity word in the predicted triplet belongs to the sample text.
Step S306, parameters in the initial text recognition model are adjusted according to the first predicted entity word, the first tag entity word, the predicted triplet and the tag triplet, and a text recognition model is generated; the text recognition model is used to generate triples for text.
Specifically, generating an entity loss value according to the first predicted entity word and the first tag entity word; generating a relation loss value according to the predicted triplet and the label triplet; determining a total loss value corresponding to the text recognition initial model according to the entity loss value and the relation loss value; and adjusting parameters in the text recognition initial model according to the total loss value to generate a text recognition model.
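The combination of the entity loss and relation loss into the total loss of step S306 can be sketched as follows; the binary cross-entropy form matches the 0/1 sequence tags used by the recognition layers, but the specific loss functions and the equal weighting (alpha) are assumptions, since the patent only states that the total loss is determined from both:

```python
import numpy as np

def binary_cross_entropy(pred, label, eps=1e-9):
    """Mean BCE over a sequence of 0/1 tags and predicted probabilities."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(label * np.log(pred)
                          + (1 - label) * np.log(1 - pred)))

def total_loss(entity_pred, entity_label, rel_pred, rel_label, alpha=1.0):
    """Entity loss from the first-entity tag sequence plus relation loss
    from the per-relation tag sequences; alpha is an assumed weighting."""
    entity_loss = binary_cross_entropy(entity_pred, entity_label)
    relation_loss = binary_cross_entropy(rel_pred, rel_label)
    return entity_loss + alpha * relation_loss

ent_p = np.array([0.9, 0.8, 0.1]); ent_y = np.array([1, 1, 0])
rel_p = np.array([0.7, 0.2]);      rel_y = np.array([1, 0])
loss = total_loss(ent_p, ent_y, rel_p, rel_y)  # a positive scalar to minimize
```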
Specifically, the training process may refer to the description of the application process in the embodiment corresponding to fig. 3, which is not described herein.
As can be seen from the foregoing, the present application first obtains the first predicted entity word in the sample text, inputs the prediction-conditional shared semantic vector for that first predicted entity word as a condition, and fuses it with the prediction shared semantic vector, so that the prediction target semantic vector is a conditioned semantic vector, i.e., the already-identified first predicted entity word serves as the condition; given the first predicted entity word, the second predicted entity word in the sample text has the association relationship with the first predicted entity word pointed to by the tag relationship entity word. By adjusting the parameters in the text recognition initial model with the first predicted entity word, the first tag entity word, the prediction triplet, and the tag triplet, the embodiment of the present application can improve the recognition rate of the association relationship between the first entity word and the second entity word in the text.
Further, referring to fig. 8, fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing means may be a computer program (comprising program code) running in a computer device, for example the data processing means is an application software; the device can be used for executing corresponding steps in the method provided by the embodiment of the application. As shown in fig. 8, the data processing apparatus 1 may include: a first acquisition module 11, a second acquisition module 12, a first generation module 13 and a second generation module 14.
The first obtaining module 11 is configured to obtain a text, and obtain a shared semantic vector corresponding to each word in the text;
a second obtaining module 12, configured to obtain, based on the shared semantic vector, a first entity word belonging to the first part of speech in the text;
the first generating module 13 is configured to obtain a conditional shared semantic vector corresponding to the first entity word from the shared semantic vectors, and perform vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
a second generating module 14, configured to perform vector recognition on the target semantic vector, so as to obtain a triplet including a first entity word, a second entity word belonging to a second part of speech, and a relational entity word; the second entity word in the triplet belongs to text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
The specific functional implementation manner of the first acquiring module 11, the second acquiring module 12, the first generating module 13, and the second generating module 14 may refer to step S101 to step S104 in the corresponding embodiment of fig. 3, and will not be described herein.
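The four modules can be sketched end to end as follows; the stub encoder, the lexicon lookup for the first part of speech, and the additive fusion are illustrative assumptions, not the claimed implementation:

```python
import numpy as np

# Illustrative sketch of the four-module pipeline; the encoder, lexicon
# lookup, and additive fusion are assumptions, not the patent's method.

def shared_encode(tokens, dim=4, seed=0):
    """First obtaining module: one shared semantic vector per word (stub)."""
    rng = np.random.default_rng(seed)
    return {tok: rng.standard_normal(dim) for tok in tokens}

def first_entity(tokens, first_pos_lexicon):
    """Second obtaining module: first entity word of the first part of speech."""
    for tok in tokens:
        if tok in first_pos_lexicon:
            return tok
    return None

def fuse(shared, entity):
    """First generating module: fuse the conditional vector into every token."""
    cond = shared[entity]
    return {tok: vec + cond for tok, vec in shared.items()}

tokens = ["drug_a", "relieves", "symptom_b"]
shared = shared_encode(tokens)
head = first_entity(tokens, {"drug_a"})
target = fuse(shared, head)   # target semantic vectors, one per token
```

The second generating module would then decode `target` into triples; that step depends on the relation recognition layer described below.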
Referring again to fig. 8, the data processing apparatus 1 may further include: a third acquisition module 15 and a third generation module 16.
The first obtaining module 11 is further configured to obtain a text recognition model, and input text to the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
the first obtaining module 11 is further configured to perform segmentation processing on the text based on the input layer to obtain at least two segmentation words; the at least two segmentation words include segmentation word E_f, where f is a positive integer and f is less than or equal to the total number of the at least two segmentation words;
a third obtaining module 15, configured to obtain the position information of segmentation word E_f in the text, and input the position information of segmentation word E_f into the shared coding layer;
a third generation module 16, configured to perform vector encoding on the position information of segmentation word E_f based on the shared coding layer, to obtain a shared position vector corresponding to segmentation word E_f;
the first generation module 13 may include: a position determining unit 131, a first acquiring unit 132, and a second acquiring unit 133.
A position determining unit 131, configured to determine the position information of the first entity word in the text; the position information of the first entity word in the text belongs to the position information of the at least two segmentation words in the text;
a first obtaining unit 132, configured to obtain, based on the position information of the first entity word in the text, a shared position vector corresponding to the first entity word from shared position vectors corresponding to at least two segmentation words respectively;
The second obtaining unit 133 is configured to obtain, from the shared semantic vectors, a conditional shared semantic vector corresponding to the first entity word based on the shared location vector corresponding to the first entity word.
The specific functional implementation manners of the first acquiring module 11, the third acquiring module 15, the third generating module 16, the determining position unit 131, the first acquiring unit 132, and the second acquiring unit 133 may be referred to the step S103 in the corresponding embodiment of fig. 3, and will not be described herein.
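The position-based lookup performed by units 131-133 can be sketched as a gather over the per-token shared vectors; pooling a multi-token entity span into a single conditional vector by averaging is an assumption:

```python
import numpy as np

# Sketch: recover the conditional shared semantic vector for a first entity
# word from the per-token shared vectors via its token positions. Mean
# pooling over the span is an assumption, not the claimed operation.

shared_vectors = np.arange(24, dtype=float).reshape(6, 4)  # 6 tokens, dim 4
entity_positions = [2, 3]                                  # entity spans tokens 2..3

entity_vectors = shared_vectors[entity_positions]          # gather by position
conditional_vector = entity_vectors.mean(axis=0)           # pool span -> condition
```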
Referring again to fig. 8, the first generating module 13 may include: a third acquisition unit 134, a first input unit 135, a second input unit 136, a third input unit 137, and a fourth input unit 138.
A third acquisition unit 134 for acquiring a text recognition model; the text recognition model comprises a first coding layer; the first encoding layer includes a self-attention component, a first normalization component, a feed-forward component, and a second normalization component;
a first input unit 135, configured to input the shared semantic vector into the self-attention component, and perform vector encoding on the shared semantic vector based on the self-attention component to obtain a first semantic vector to be normalized;
the second input unit 136 is configured to input the first semantic vector to be normalized and the shared semantic vector into the first normalization component, and perform weighted fusion on the first semantic vector to be normalized and the shared semantic vector based on the first normalization component to obtain a semantic vector to be feedforward;
A third input unit 137, configured to input the semantic vector to be feedforward to the feedforward component, and perform vector encoding on the semantic vector to be feedforward based on the feedforward component to obtain a second semantic vector to be normalized;
the fourth input unit 138 is configured to input the conditional sharing semantic vector, the second semantic vector to be normalized, and the semantic vector to be feedforward into the second normalization component, and perform vector fusion on the conditional sharing semantic vector, the second semantic vector to be normalized, and the semantic vector to be feedforward based on the second normalization component, to obtain a target semantic vector.
The specific functional implementation manner of the third obtaining unit 134, the first input unit 135, the second input unit 136, the third input unit 137 and the fourth input unit 138 may be referred to the step S103 in the corresponding embodiment of fig. 3, and will not be described herein.
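The flow through the first coding layer (units 134-138) can be sketched as follows; the random placeholder weights and the purely additive fusion in the last step are illustrative assumptions, not the claimed parameters:

```python
import numpy as np

# Sketch of the first coding layer: self-attention -> residual fusion ->
# feed-forward -> conditional fusion. Weights are random placeholders; the
# additive conditional fusion stands in for the second normalization component.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def first_coding_layer(shared, cond, rng):
    d = shared.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    # Self-attention component: first semantic vector to be normalized
    attn = softmax(shared @ Wq @ (shared @ Wk).T / np.sqrt(d)) @ (shared @ Wv)
    # First normalization component: fuse with the shared semantic vectors
    to_ff = attn + shared                      # semantic vector to be feedforward
    # Feed-forward component: second semantic vector to be normalized
    Wf = rng.standard_normal((d, d)) * 0.1
    ff = np.maximum(to_ff @ Wf, 0.0)
    # Second normalization component: fuse in the conditional shared vector
    return ff + to_ff + cond                   # target semantic vectors

rng = np.random.default_rng(0)
shared = rng.standard_normal((5, 8))           # 5 tokens, dim 8
cond = rng.standard_normal(8)                  # conditional shared semantic vector
target = first_coding_layer(shared, cond, rng)
```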
Referring again to fig. 8, the second normalization component includes an averaging sub-component, a distance sub-component, a standard sub-component, a scaling sub-component, a weighting sub-component, and a fusion sub-component;
the fourth input unit 138 may include: the first generation subunit 1381, the second generation subunit 1382, the third generation subunit 1383, the fourth generation subunit 1384, the fifth generation subunit 1385, the sixth generation subunit 1386, and the seventh generation subunit 1387.
A first generating subunit 1381, configured to perform vector averaging on the second semantic vector to be normalized and the semantic vector to be feedforward based on the averaging sub-component, to obtain an average semantic vector;
a second generating subunit 1382, configured to obtain, based on the distance sub-component, the vector distance between the second semantic vector to be normalized and the average semantic vector to obtain a first distance vector, and the vector distance between the semantic vector to be feedforward and the average semantic vector to obtain a second distance vector;
a third generating subunit 1383, configured to perform vector standardization on the second semantic vector to be normalized and the semantic vector to be feedforward based on the standard sub-component, to obtain a standard semantic vector;
a fourth generating subunit 1384, configured to perform vector scaling on the first distance vector and the standard semantic vector based on the scaling sub-component to obtain a first scaling vector, and perform vector scaling on the second distance vector and the standard semantic vector to obtain a second scaling vector;
a fifth generating subunit 1385, configured to generate a first weight feature and a second weight feature that correspond to the conditional shared semantic vector;
a sixth generating subunit 1386, configured to perform weighted fusion on the first scaling vector, the second scaling vector and the first weight feature based on the weighting sub-component, to obtain a semantic vector to be fused;
A seventh generating subunit 1387, configured to perform vector fusion on the second weight feature and the semantic vector to be fused based on the fusion sub-component, to obtain the target semantic vector.
The specific functional implementation manners of the first generating subunit 1381, the second generating subunit 1382, the third generating subunit 1383, the fourth generating subunit 1384, the fifth generating subunit 1385, the sixth generating subunit 1386 and the seventh generating subunit 1387 may be referred to in step S103 in the corresponding embodiment of fig. 3, and will not be described herein again.
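The seven subunits describe what amounts to a conditional layer normalization; a minimal sketch, assuming the two weight features are linear maps (`Wg`, `Wb`, hypothetical names) of the conditional shared semantic vector:

```python
import numpy as np

# Sketch of the second normalization component as a conditional layer
# normalization: the residual sum of the two inputs is normalized, and the
# scale (first weight feature) and shift (second weight feature) are derived
# from the conditional shared semantic vector. Wg and Wb are assumptions.

def conditional_layer_norm(to_norm, to_ff, cond, Wg, Wb, eps=1e-5):
    x = to_norm + to_ff                          # residual sum of both inputs
    mean = x.mean(axis=-1, keepdims=True)        # averaging sub-component
    dist = x - mean                              # distance sub-component
    std = x.std(axis=-1, keepdims=True)          # standard sub-component
    scaled = dist / (std + eps)                  # scaling sub-component
    gamma = cond @ Wg                            # first weight feature
    beta = cond @ Wb                             # second weight feature
    return scaled * gamma + beta                 # weighting + fusion sub-components

rng = np.random.default_rng(1)
d = 6
to_norm = rng.standard_normal((4, d))            # second semantic vector to be normalized
to_ff = rng.standard_normal((4, d))              # semantic vector to be feedforward
cond = rng.standard_normal(d)                    # conditional shared semantic vector
out = conditional_layer_norm(to_norm, to_ff, cond, np.eye(d), np.eye(d))
```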
Referring again to fig. 8, the second generating module 14 may include: the fourth acquisition unit 141 and the first generation unit 142.
A fourth acquisition unit 141 for acquiring a text recognition model; the text recognition model comprises a relation recognition layer;
the fourth obtaining unit 141 is further configured to input the target semantic vector into a relationship recognition layer, and perform vector recognition on the target semantic vector based on the relationship recognition layer to obtain a recognition semantic vector;
the first generating unit 142 is configured to generate, based on the recognition semantic vector, a triplet including a first entity word, a second entity word belonging to the second part of speech, and a relationship entity word.
The specific functional implementation manner of the fourth obtaining unit 141 and the first generating unit 142 may refer to step S104 in the corresponding embodiment of fig. 3, which is not described herein.
Referring again to fig. 8, the first acquisition module 11 may include: a fifth acquisition unit 111, a second generation unit 112, and a third generation unit 113.
A fifth acquisition unit 111 for acquiring a text recognition model, and inputting text to the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
a second generating unit 112, configured to perform segmentation processing on the text based on the input layer, obtain at least two segmentation words, and input the at least two segmentation words into the shared coding layer respectively;
the third generating unit 113 is configured to perform vector encoding on at least two words based on the shared encoding layer, so as to obtain a shared semantic vector corresponding to each word.
The specific functional implementation manner of the fifth obtaining unit 111, the second generating unit 112, and the third generating unit 113 may refer to step S101 in the corresponding embodiment of fig. 3, which is not described herein.
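The input layer and shared coding layer (units 111-113) can be sketched as follows; whitespace segmentation and the hash-seeded stub encoder are assumptions standing in for the real tokenizer and encoder:

```python
import numpy as np

# Sketch of the input layer and shared coding layer: whitespace segmentation
# and a deterministic per-word stub encoder stand in for the real components.

def segment(text):
    """Input layer: split the text into segmentation words."""
    return text.split()

def shared_coding_layer(words, dim=4):
    """Shared coding layer: one vector per word (hash-seeded stub encoder)."""
    vectors = []
    for word in words:
        rng = np.random.default_rng(abs(hash(word)) % (2**32))
        vectors.append(rng.standard_normal(dim))
    return np.stack(vectors)

words = segment("aspirin relieves headache")
shared = shared_coding_layer(words)   # shared semantic vector per word
```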
Referring again to fig. 8, the second acquisition module 12 may include: a sixth acquisition unit 121, a fifth input unit 122, a sixth input unit 123, and a seventh input unit 124.
A sixth acquisition unit 121 for acquiring a text recognition model; the text recognition model comprises a second coding layer, an entity recognition layer and a decoding layer;
A fifth input unit 122, configured to input the shared semantic vector into a second encoding layer, and perform vector encoding on the shared semantic vector based on the second encoding layer to obtain a semantic vector to be identified;
a sixth input unit 123, configured to input the semantic vector to be identified into the entity identification layer, and perform vector identification on the semantic vector to be identified based on the entity identification layer, so as to obtain a semantic vector to be decoded, which is used for characterizing the first entity word;
the seventh input unit 124 is configured to input the semantic vector to be decoded for representing the first entity word into a decoding layer, and perform vector decoding on the semantic vector to be decoded for representing the first entity word based on the decoding layer, so as to obtain the first entity word belonging to the first part of speech in the text.
The specific functional implementation manner of the sixth obtaining unit 121, the fifth input unit 122, the sixth input unit 123, and the seventh input unit 124 may refer to step S102 in the corresponding embodiment of fig. 3, and will not be described herein.
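The entity branch (units 121-124) can be sketched with a BIO-style decoding layer; the BIO scheme and the hand-written tag scores (standing in for the output of the second coding layer and entity recognition layer) are assumptions, since the patent does not name its tagging scheme:

```python
import numpy as np

# Sketch of the entity branch: per-token tag scores over a BIO scheme stand
# in for the semantic vectors to be decoded, and a greedy decoding layer
# reads off the first entity word. The BIO scheme is an assumption.

TAGS = ["O", "B", "I"]

def decoding_layer(scores, words):
    """Decoding layer: return the first B..I span as the first entity word."""
    tags = [TAGS[i] for i in scores.argmax(axis=-1)]
    entity = []
    for word, tag in zip(words, tags):
        if tag == "B":
            entity = [word]
        elif tag == "I" and entity:
            entity.append(word)
        elif entity:
            break
    return " ".join(entity)

words = ["acetyl", "salicylic", "acid", "relieves", "pain"]
# Rows: per-word scores for the tags O / B / I (stub values)
scores = np.array([[0, 9, 0], [0, 0, 9], [0, 0, 9],
                   [9, 0, 0], [9, 0, 0]], dtype=float)
entity_word = decoding_layer(scores, words)
```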
Referring again to fig. 8, the semantic vectors to be identified include semantic vector A_b, where b is a positive integer and b is less than or equal to the total number of semantic vectors to be identified; the entity recognition layer includes an entity recognition component C_b for semantic vector A_b.
The sixth input unit 123 may include: the eighth generation subunit 1231, the ninth generation subunit 1232, and the tenth generation subunit 1233.
An eighth generating subunit 1231, configured to, if entity recognition component C_b is the first entity recognition component in the entity recognition layer, perform vector recognition on semantic vector A_b in entity recognition component C_b, to obtain semantic vector D_b to be decoded corresponding to semantic vector A_b;
a ninth generating subunit 1232, configured to, if entity recognition component C_b is not the first entity recognition component in the entity recognition layer, perform, in entity recognition component C_b, vector fusion on semantic vector A_b and the target semantic vector to be identified that has a position association relationship with semantic vector A_b, to obtain semantic vector D_b to be decoded corresponding to semantic vector A_b; the target semantic vector to be identified belongs to the semantic vectors to be identified;
tenth generation subunit 1233 is configured to obtain, based on the semantic vectors to be decoded corresponding to each semantic vector to be identified, a semantic vector to be decoded for characterizing the first entity word.
The specific functional implementation manner of the eighth generating subunit 1231, the ninth generating subunit 1232, and the tenth generating subunit 1233 may refer to step S102 in the corresponding embodiment of fig. 3, which is not described herein.
Referring again to fig. 8, the triples include at least two triples for the first entity word; the at least two triples include triplet G_h, where h is a positive integer and h is less than or equal to the total number of the at least two triples;
the data processing apparatus 1 may further include: a first determining module 16, a second determining module 17, a third determining module 18, and an association storage module 19.
A first determining module 16, configured to determine whether the association relationship attribute represented by the relationship entity word in triplet G_h is a negative association relationship, and if the association relationship attribute is a negative association relationship, determine the object represented by the first entity word as a target object;
a second determining module 17, configured to obtain, from the at least two triples, the triples including abbreviation relationship entity words, and determine the second entity word in the triples including abbreviation relationship entity words as an abbreviation entity word; the object represented by the abbreviation entity word is equal to the target object; an abbreviation relationship entity word refers to a relationship entity word whose represented association relationship attribute is the abbreviation association relationship;
a third determining module 18, configured to obtain, from the at least two triples, the triples including attribution relationship entity words, and determine the second entity word in the triples including attribution relationship entity words as an attribution entity word; the object represented by the attribution entity word is attributed to the target object; an attribution relationship entity word refers to a relationship entity word whose represented association relationship attribute is the attribution association relationship;
An association storage module 19, configured to store, in association, the relationship entity word in triplet G_h, the abbreviation entity word, and the attribution entity word.
The specific functional implementation manners of the first determining module 16, the second determining module 17, the third determining module 18, and the association storage module 19 may refer to step S205 to step S208 in the corresponding embodiment of fig. 6, which are not described herein.
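The post-processing performed by these modules can be sketched as follows; the relation labels "negative", "abbreviation", and "attribution" are illustrative placeholders for the relationship entity words:

```python
# Sketch of the triple post-processing: for each target object (head of a
# negative-association triple), collect its abbreviation and attribution
# entity words and store them together. Labels are placeholders.

def organize_triples(triples):
    stored = {}
    for head, relation, tail in triples:
        if relation == "negative":   # negative association: head is a target object
            stored.setdefault(head, {"abbreviations": [], "attributions": []})
    for head, relation, tail in triples:
        if head in stored:
            if relation == "abbreviation":
                stored[head]["abbreviations"].append(tail)
            elif relation == "attribution":
                stored[head]["attributions"].append(tail)
    return stored

triples = [
    ("Company A", "negative", "lawsuit"),
    ("Company A", "abbreviation", "CoA"),
    ("Company A", "attribution", "Subsidiary B"),
]
index = organize_triples(triples)
```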
As can be seen from the foregoing, the present application first obtains the first entity word in the text, takes the conditional shared semantic vector for the first entity word as a condition, and fuses it with the shared semantic vector, so that the target semantic vector is a conditioned semantic vector. That is, the identified first entity word serves as the condition: given the first entity word, the second entity word in the text has, with the first entity word, the association relationship pointed to by the relationship entity word. Therefore, the embodiment of the present application can improve the recognition rate of the association relationship between the first entity word and the second entity word.
Further, referring to fig. 9, fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing means may be a computer program (comprising program code) running in a computer device, for example the data processing means is an application software; the device can be used for executing corresponding steps in the method provided by the embodiment of the application. As shown in fig. 9, the data processing apparatus 2 may include: a first acquisition module 21, a second acquisition module 22, a first generation module 23, a second generation module 24, and a third generation module 25.
A first acquisition module 21 for acquiring a training sample set; the training sample set includes a sample text, a first tag entity word in the sample text, and a tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to the first part of speech; the tag triplet includes the first tag entity word, a second tag entity word belonging to a second part of speech, and a tag relationship entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; the tag relationship entity word in the tag triplet is used to represent the association relationship between the first tag entity word and the second tag entity word in the tag triplet;
the first obtaining module 21 is further configured to input the sample text into a text recognition initial model, and obtain, in the text recognition initial model, a prediction shared semantic vector corresponding to each sample word in the sample text;
a second obtaining module 22, configured to obtain a first predicted entity word in the sample text based on the predicted shared semantic vector;
the first generating module 23 is configured to obtain a prediction condition sharing semantic vector corresponding to the first predicted entity word from the prediction sharing semantic vector, and perform vector fusion on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector;
The second generating module 24 is configured to perform vector recognition on the prediction target semantic vector to obtain a prediction triplet including a first prediction entity word, a second prediction entity word, and a prediction relationship entity word; the second predicted entity word in the predicted triplet belongs to the sample text;
the third generating module 25 is configured to adjust parameters in the initial text recognition model according to the first predicted entity word, the first tag entity word, the predicted triplet, and the tag triplet, and generate a text recognition model; the text recognition model is used to generate triples for text.
The specific functional implementation manner of the first obtaining module 21, the second obtaining module 22, the first generating module 23, the second generating module 24, and the third generating module 25 may refer to step S301 to step S306 in the corresponding embodiment of fig. 7, and will not be described herein.
Referring to fig. 9 again, the third generating module 25 includes a first generating unit 251, a second generating unit 252, and a third generating unit 253;
a first generating unit 251, configured to generate an entity loss value according to the first predicted entity word and the first tag entity word;
a second generating unit 252, configured to generate a relationship loss value according to the prediction triplet and the tag triplet;
A third generating unit 253, configured to determine a total loss value corresponding to the text recognition initial model according to the entity loss value and the relationship loss value;
the third generating unit 253 is further configured to adjust parameters in the initial text recognition model according to the total loss value, and generate a text recognition model.
The specific functional implementation manner of the first generating unit 251, the second generating unit 252 and the third generating unit 253 may refer to step S306 in the corresponding embodiment of fig. 7, which is not described herein.
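The training objective described by units 251-253 can be sketched as the sum of an entity loss and a relation loss; the use of cross-entropy and the equal weighting of the two terms are assumptions:

```python
import numpy as np

# Sketch of the training objective: a cross-entropy entity loss plus a
# relation loss, summed into the total loss used to adjust the model's
# parameters. Equal weighting of the two terms is an assumption.

def cross_entropy(logits, label):
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

entity_logits = np.array([2.0, 0.5, -1.0])   # scores for the first predicted entity word
relation_logits = np.array([0.1, 3.0])       # scores for the predicted relationship entity word

entity_loss = cross_entropy(entity_logits, label=0)      # first tag entity word
relation_loss = cross_entropy(relation_logits, label=1)  # tag relationship entity word
total_loss = entity_loss + relation_loss     # drives the parameter update
```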
As can be seen from the foregoing, the present application first obtains the first predicted entity word in the sample text, takes the prediction condition shared semantic vector for the first predicted entity word as a condition, and fuses it with the prediction shared semantic vector, so that the prediction target semantic vector is a conditioned semantic vector. In other words, the identified first predicted entity word serves as the condition: given the first predicted entity word, the second predicted entity word in the sample text has, with the first predicted entity word, the association relationship pointed to by the tag relationship entity word. The parameters in the text recognition initial model are then adjusted using the first predicted entity word, the first tag entity word, the prediction triplet and the tag triplet, so the embodiment of the present application can improve the recognition rate of the association relationship between the first entity word and the second entity word in the text.
Further, referring to fig. 10, fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 10, the computer device 1000 may include: at least one processor 1001, such as a CPU, at least one network interface 1004, a user interface 1003, a memory 1005, at least one communication bus 1002. Wherein the communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a Display (Display), a Keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others. The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may also optionally be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 10, the memory 1005, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 1000 shown in FIG. 10, the network interface 1004 may provide network communication functions; while user interface 1003 is primarily used as an interface for providing input to a user; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
Acquiring a text, and acquiring a shared semantic vector corresponding to each word in the text;
based on the shared semantic vector, acquiring a first entity word belonging to a first part of speech in the text;
acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector, and carrying out vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
carrying out vector recognition on the target semantic vector to obtain a triplet comprising a first entity word, a second entity word belonging to a second part of speech and a relation entity word; the second entity word in the triplet belongs to text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
It should be understood that the computer device 1000 described in the embodiments of the present application may perform the description of the data processing method in the embodiments corresponding to fig. 3, 6 and 7, and may also perform the description of the data processing apparatus 1 in the embodiments corresponding to fig. 8, which are not described herein. In addition, the description of the beneficial effects of the same method is omitted.
Further, referring to fig. 11, fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 11, the computer device 2000 may include: processor 2001, network interface 2004 and memory 2005, in addition, the above-described computer device 2000 may further include: a user interface 2003, and at least one communication bus 2002. Wherein a communication bus 2002 is used to enable connected communications between these components. The user interface 2003 may include a Display screen (Display), a Keyboard (Keyboard), and the optional user interface 2003 may further include a standard wired interface, a wireless interface, among others. The network interface 2004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 2005 may be a high-speed RAM memory or a nonvolatile memory (non-volatile memory), such as at least one magnetic disk memory. The memory 2005 may also optionally be at least one storage device located remotely from the aforementioned processor 2001. As shown in fig. 11, an operating system, a network communication module, a user interface module, and a device control application program may be included in the memory 2005 as one type of computer-readable storage medium.
In the computer device 2000 illustrated in fig. 11, the network interface 2004 may provide network communication functions; while user interface 2003 is primarily an interface for providing input to a user; and processor 2001 may be used to invoke device control applications stored in memory 2005 to implement:
acquiring a training sample set; the training sample set includes a sample text, a first tag entity word in the sample text, and a tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to the first part of speech; the tag triplet includes the first tag entity word, a second tag entity word belonging to a second part of speech, and a tag relationship entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; the tag relationship entity word in the tag triplet is used to represent the association relationship between the first tag entity word and the second tag entity word in the tag triplet;
inputting the sample text into a text recognition initial model, and acquiring a prediction sharing semantic vector corresponding to each sample word in the sample text in the text recognition initial model;
based on the prediction sharing semantic vector, acquiring a first prediction entity word in the sample text;
Obtaining a prediction condition sharing semantic vector corresponding to the first prediction entity word from the prediction sharing semantic vector, and carrying out vector fusion on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector;
vector recognition is carried out on the predicted target semantic vector, and a predicted triplet comprising a first predicted entity word, a second predicted entity word and a predicted relation entity word is obtained; the second predicted entity word in the predicted triplet belongs to the sample text;
according to the first predicted entity word, the first tag entity word, the predicted triplet and the tag triplet, parameters in the initial text recognition model are adjusted, and a text recognition model is generated; the text recognition model is used to generate triples for text.
It should be understood that the computer device 2000 described in the embodiments of the present application may perform the description of the data processing method in the embodiments corresponding to fig. 3, 6 and 7, and may also perform the description of the data processing apparatus 2 in the embodiments corresponding to fig. 9, which are not described herein. In addition, the description of the beneficial effects of the same method is omitted.
The embodiment of the present application further provides a computer readable storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, implement the data processing method provided by each step in fig. 3, fig. 6 and fig. 7; for details, refer to the implementation manner provided by each step in fig. 3, fig. 6 and fig. 7, which is not described herein again. In addition, the description of the beneficial effects of the same method is omitted.
The computer readable storage medium may be the data processing apparatus provided in any one of the foregoing embodiments or an internal storage unit of the computer device, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card (flash card) or the like, which are provided on the computer device. Further, the computer-readable storage medium may also include both internal storage units and external storage devices of the computer device. The computer-readable storage medium is used to store the computer program and other programs and data required by the computer device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device can execute the description of the data processing method in the embodiments corresponding to fig. 3, fig. 6, and fig. 7, which are not described herein. In addition, the description of the beneficial effects of the same method is omitted.
The terms "first", "second", and the like in the description, claims, and drawings of the embodiments of the present application are used to distinguish different objects, not to describe a particular order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to the listed steps or elements, but may alternatively include other steps or elements not listed or inherent to such a process, method, apparatus, product, or device.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of function. Whether such functions are implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementation should not be considered beyond the scope of the present application.
The methods and related devices provided in the embodiments of the present application are described with reference to the method flowcharts and/or structure diagrams provided in the embodiments of the present application. Each flow and/or block of the flowcharts and/or structure diagrams, and combinations of flows and/or blocks therein, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structure diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structure diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structure diagrams.
The foregoing disclosure is merely illustrative of preferred embodiments of the present application and is not intended to limit the scope of the claims of the present application; equivalent variations made according to the claims of the present application shall therefore remain within the scope of the present application.

Claims (15)

1. A method of data processing, comprising:
acquiring a text, and acquiring a shared semantic vector corresponding to each word in the text;
based on the shared semantic vector, acquiring a first entity word belonging to a first part of speech in the text;
acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector, and carrying out vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
vector recognition is carried out on the target semantic vector, and a triplet comprising the first entity word, the second entity word belonging to the second part of speech and the relation entity word is obtained; the second entity word in the triplet belongs to the text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
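As a rough illustration of the data flow in claim 1, the sketch below strings the four steps together. The encoder, the entity head, and the additive fusion are simplified stand-ins invented for this example, not the patented model.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_encoder(tokens):
    # Hypothetical shared coding layer: one shared semantic vector per token.
    return rng.standard_normal((len(tokens), 8))

def first_entity_positions(tokens):
    # Hypothetical entity head: pretend token 1 is the first entity word.
    return [1]

tokens = ["the", "company", "released", "a", "product"]
shared = shared_encoder(tokens)              # shared semantic vectors
pos = first_entity_positions(tokens)         # position of the first entity word
cond = shared[pos].mean(axis=0)              # conditional shared semantic vector
target = shared + cond                       # vector fusion -> target semantic vector
```

The target vectors would then be fed to a relation head that emits (first entity word, relation entity word, second entity word) triples.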
2. The method according to claim 1, wherein the method further comprises:
acquiring a text recognition model, and inputting the text into the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
performing segmentation processing on the text based on the input layer to obtain at least two segmentation words; the at least two segmentation words comprise a segmentation word E_f, f is a positive integer, and f is less than or equal to the total number corresponding to the at least two segmentation words;
acquiring position information of the segmentation word E_f in the text, and inputting the position information of the segmentation word E_f into the shared coding layer;
performing vector encoding on the position information of the segmentation word E_f based on the shared coding layer to obtain a shared position vector corresponding to the segmentation word E_f;
the obtaining the conditional shared semantic vector corresponding to the first entity word in the shared semantic vector includes:
determining position information of the first entity word in the text; the position information of the first entity word in the text belongs to the position information of the at least two segmentation words in the text respectively;
based on the position information of the first entity word in the text, acquiring a shared position vector corresponding to the first entity word from the shared position vectors respectively corresponding to the at least two segmentation words;
and acquiring the conditional shared semantic vector corresponding to the first entity word from the shared semantic vector based on the shared position vector corresponding to the first entity word.
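The position-based lookup of claim 2 can be pictured as a gather-and-pool over the shared vectors. The toy values and the mean pooling below are assumptions; the claim does not fix how a multi-token entity is reduced to one vector.

```python
import numpy as np

# Toy setup: four tokens, each with a three-dimensional shared semantic vector.
shared = np.arange(12.0).reshape(4, 3)      # shared semantic vectors per token
entity_positions = [1, 2]                   # position info of the first entity word
shared_position = shared[entity_positions]  # gather via the shared position vectors
cond = shared_position.mean(axis=0)         # conditional shared semantic vector
```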
3. The method of claim 1, wherein vector fusing the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector comprises:
acquiring a text recognition model; the text recognition model includes a first coding layer; the first encoding layer comprises a self-attention component, a first normalization component, a feedforward component and a second normalization component;
inputting the shared semantic vector into the self-attention component, and carrying out vector coding on the shared semantic vector based on the self-attention component to obtain a first semantic vector to be normalized;
respectively inputting the first semantic vector to be normalized and the shared semantic vector into the first normalization component, and carrying out weighted fusion on the first semantic vector to be normalized and the shared semantic vector based on the first normalization component to obtain a semantic vector to be feedforward;
inputting the semantic vector to be fed forward to the feedforward component, and carrying out vector coding on the semantic vector to be fed forward based on the feedforward component to obtain a second semantic vector to be normalized;
and respectively inputting the conditional shared semantic vector, the second semantic vector to be normalized, and the semantic vector to be fed forward into the second normalization component, and performing vector fusion on the conditional shared semantic vector, the second semantic vector to be normalized, and the semantic vector to be fed forward based on the second normalization component to obtain the target semantic vector.
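One possible reading of this first coding layer is a Transformer-style block whose second normalization is conditioned on the entity vector. The sketch below assumes additive residuals, tanh as the feed-forward component, and scale-and-shift conditioning; none of these choices is fixed by the claim.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    sd = x.std(-1, keepdims=True)
    return (x - mu) / (sd + eps)

def self_attention(x):
    # Unparameterized scaled dot-product attention, for illustration only.
    s = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(s - s.max(-1, keepdims=True))
    return (w / w.sum(-1, keepdims=True)) @ x

def first_coding_layer(shared, cond):
    a = self_attention(shared)         # first semantic vector to be normalized
    ff_in = layer_norm(a + shared)     # first normalization: fuse with residual
    f = np.tanh(ff_in)                 # stand-in feed-forward component
    # Second normalization: fuse the conditional shared semantic vector as a
    # condition-derived scale and shift (one reading of claims 3 and 4).
    return (1.0 + cond) * layer_norm(f + ff_in) + cond
```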
4. The method of claim 3, wherein the second normalization component comprises an averaging sub-component, a distance sub-component, a standard sub-component, a scaling sub-component, a weighting sub-component, and a fusion sub-component;
the performing vector fusion on the conditional shared semantic vector, the second semantic vector to be normalized, and the semantic vector to be fed forward based on the second normalization component to obtain the target semantic vector comprises:
vector averaging is carried out on the second semantic vector to be normalized and the semantic vector to be feedforward based on the averaging subassembly, so that an average semantic vector is obtained;
based on the distance sub-component, acquiring a vector distance between the second semantic vector to be normalized and the average semantic vector to obtain a first distance vector, and acquiring a vector distance between the semantic vector to be fed forward and the average semantic vector to obtain a second distance vector;
Based on the standard sub-assembly, vector standards are carried out on the second semantic vector to be normalized and the semantic vector to be fed forward, and a standard semantic vector is obtained;
vector scaling is carried out on the first distance vector and the standard semantic vector based on the scaling sub-component to obtain a first scaling vector, and vector scaling is carried out on the second distance vector and the standard semantic vector to obtain a second scaling vector;
generating a first weight feature and a second weight feature that each correspond to the conditional shared semantic vector;
based on the weighting sub-component, carrying out weighted fusion on the first scaling vector, the second scaling vector and the first weight feature to obtain a semantic vector to be fused;
and carrying out vector fusion on the second weight characteristic and the semantic vector to be fused based on the fusion sub-component to obtain the target semantic vector.
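Read together, the claim-4 sub-components resemble a conditional layer normalization. The sketch below maps each sub-component to one line under that assumption; the shapes and exact fusion order are guesses, not the claim's wording.

```python
import numpy as np

def second_normalization(v2, ff, cond, eps=1e-5):
    # One reading of the claim-4 sub-components as conditional layer norm.
    x = v2 + ff                           # combine the two input vectors
    mean = x.mean(-1, keepdims=True)      # averaging sub-component
    dist = x - mean                       # distance sub-component
    std = x.std(-1, keepdims=True) + eps  # standard sub-component
    scaled = dist / std                   # scaling sub-component
    w1 = 1.0 + cond                       # first weight feature (scale)
    w2 = cond                             # second weight feature (shift)
    return w1 * scaled + w2               # weighting + fusion sub-components
```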
5. The method of claim 1, wherein the vector recognition of the target semantic vector results in a triplet comprising the first entity word, a second entity word belonging to a second part of speech, and a relational entity word, comprising:
acquiring a text recognition model; the text recognition model comprises a relation recognition layer;
inputting the target semantic vector into the relation recognition layer, and carrying out vector recognition on the target semantic vector based on the relation recognition layer to obtain a recognition semantic vector;
based on the recognition semantic vector, a triplet is generated that includes the first entity word, a second entity word that belongs to a second part of speech, and a relationship entity word.
6. The method of claim 1, wherein the obtaining a shared semantic vector corresponding to each word segment in the text comprises:
acquiring a text recognition model, and inputting the text into the text recognition model; the text recognition model comprises an input layer and a shared coding layer;
the text is segmented based on the input layer, at least two segmented words are obtained, and the at least two segmented words are respectively input into the shared coding layer;
and respectively carrying out vector coding on the at least two segmented words based on the shared coding layer to obtain shared semantic vectors respectively corresponding to each segmented word.
7. The method of claim 1, wherein the obtaining, based on the shared semantic vector, a first entity word belonging to a first part of speech in the text comprises:
acquiring a text recognition model; the text recognition model comprises a second coding layer, an entity recognition layer and a decoding layer;
inputting the shared semantic vector into the second coding layer, and carrying out vector coding on the shared semantic vector based on the second coding layer to obtain a semantic vector to be identified;
inputting the semantic vector to be identified into the entity recognition layer, and performing vector recognition on the semantic vector to be identified based on the entity recognition layer to obtain a semantic vector to be decoded for representing the first entity word;
inputting the semantic vector to be decoded for representing the first entity word into the decoding layer, and performing vector decoding on the semantic vector to be decoded based on the decoding layer to obtain the first entity word belonging to the first part of speech in the text.
8. The method of claim 7, wherein the semantic vector to be identified comprises a semantic vector A_b to be identified, b is a positive integer, and b is less than or equal to the total number corresponding to the semantic vectors to be identified; the entity recognition layer comprises an entity recognition component C_b for the semantic vector A_b to be identified;
the inputting the semantic vector to be identified into the entity recognition layer and performing vector recognition on the semantic vector to be identified based on the entity recognition layer to obtain the semantic vector to be decoded for representing the first entity word comprises:
if the entity recognition component C_b is the first entity recognition component in the entity recognition layer, performing vector recognition on the semantic vector A_b to be identified in the entity recognition component C_b to obtain a semantic vector D_b to be decoded corresponding to the semantic vector A_b to be identified;
if the entity recognition component C_b is not the first entity recognition component in the entity recognition layer, performing, in the entity recognition component C_b, vector fusion on the semantic vector A_b to be identified and a target semantic vector to be identified having a position association relationship with the semantic vector A_b to be identified, to obtain a semantic vector D_b to be decoded corresponding to the semantic vector A_b to be identified; the target semantic vector to be identified belongs to the semantic vectors to be identified;
and obtaining the semantic vector to be decoded for representing the first entity word based on the semantic vectors to be decoded respectively corresponding to the semantic vectors to be identified.
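The branch structure of claim 8 suggests a left-to-right chain in which each component fuses its input with the previous component's output (the positionally associated target vector). A minimal sketch under that reading, with tanh as an arbitrary stand-in for the component's encoding:

```python
import numpy as np

def entity_recognition_layer(vectors_to_identify):
    # The first component C_1 encodes A_1 alone; every later C_b fuses A_b
    # with the previous component's output D_{b-1} before encoding.
    decoded, prev = [], None
    for a_b in vectors_to_identify:
        d_b = np.tanh(a_b) if prev is None else np.tanh(a_b + prev)
        decoded.append(d_b)
        prev = d_b
    return np.stack(decoded)   # semantic vectors to be decoded
```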
9. The method of claim 1, wherein the triples comprise at least two triples for the first entity word; the at least two triples comprise a triplet G_h, h is a positive integer, and h is less than or equal to the total number of the at least two triples;
the method further comprises the steps of:
determining an association relationship attribute characterized by the relationship entity word in the triplet G_h; if the association relationship attribute is a negative association relationship, determining the object characterized by the first entity word as a target object;
acquiring, from the at least two triples, a triplet comprising an abbreviation relationship entity word, and determining the second entity word in the triplet comprising the abbreviation relationship entity word as an abbreviation entity word; the object characterized by the abbreviation entity word is equal to the target object; the abbreviation relationship entity word refers to a relationship entity word whose characterized association relationship attribute is an abbreviation relationship;
acquiring, from the at least two triples, a triplet comprising an attribution relationship entity word, and determining the second entity word in the triplet comprising the attribution relationship entity word as an attribution entity word; the object characterized by the attribution entity word is attributed to the target object; the attribution relationship entity word refers to a relationship entity word whose characterized association relationship attribute is an attribution relationship;
and associatively storing the triplet G_h, the abbreviation entity word, and the attribution entity word.
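The storage step of claim 9 can be pictured with hypothetical triples; the relation labels and entity names below are invented for illustration and are not the patent's vocabulary.

```python
# Invented triples for one first entity word ("drug-X").
triples = [
    ("drug-X", "negative-association", "symptom-Y"),
    ("drug-X", "abbreviation", "DX"),
    ("drug-X", "attribution", "class-Z"),
]

store = {}
for head, rel, tail in triples:
    if rel == "negative-association":
        # The head of a negative-association triple is the target object.
        store.setdefault(head, {"triples": [], "abbr": [], "belongs": []})
        store[head]["triples"].append((head, rel, tail))

for head, rel, tail in triples:
    if head in store and rel == "abbreviation":
        store[head]["abbr"].append(tail)     # abbreviation entity word
    elif head in store and rel == "attribution":
        store[head]["belongs"].append(tail)  # attribution entity word
```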
10. A method of data processing, comprising:
acquiring a training sample set; the training sample set comprises a sample text, a first tag entity word in the sample text, and a tag triplet associated with the sample text; the part of speech of the first tag entity word belongs to a first part of speech; the tag triplet comprises the first tag entity word, a second tag entity word belonging to a second part of speech, and a tag relationship entity word; the first part of speech is different from the second part of speech; the second tag entity word in the tag triplet belongs to the sample text; the tag relationship entity word in the tag triplet is used for characterizing an association relationship between the first tag entity word and the second tag entity word in the tag triplet;
inputting the sample text into a text recognition initial model, and acquiring a prediction sharing semantic vector corresponding to each sample word in the sample text in the text recognition initial model;
acquiring a first predicted entity word in the sample text based on the predicted shared semantic vector;
obtaining a prediction condition sharing semantic vector corresponding to the first predicted entity word from the prediction sharing semantic vector, and carrying out vector fusion on the prediction condition sharing semantic vector and the prediction sharing semantic vector to obtain a prediction target semantic vector;
vector recognition is carried out on the prediction target semantic vector, and a prediction triplet comprising the first prediction entity word, the second prediction entity word and the prediction relation entity word is obtained; a second predicted entity word in the predicted triplet belongs to the sample text;
according to the first predicted entity word, the first tag entity word, the predicted triplet and the tag triplet, parameters in the text recognition initial model are adjusted, and a text recognition model is generated; the text recognition model is used to generate triples for text.
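A toy one-sample training loop in the spirit of claim 10's final step: parameters are adjusted from both the entity signal and the triplet signal. The linear heads, squared error, and fixed values below are stand-ins for the real model and losses.

```python
import numpy as np

x = np.array([0.5, -0.2, 0.1, 0.3])   # made-up features of one sample text
w_ent = np.zeros(4)                   # entity-prediction parameters
w_rel = np.zeros(4)                   # triplet-prediction parameters
y_ent, y_rel = 1.0, -1.0              # stand-ins for tag entity / tag triplet

lr = 0.1
for _ in range(200):
    e = w_ent @ x - y_ent             # first predicted entity vs. first tag entity
    r = w_rel @ x - y_rel             # predicted triplet vs. tag triplet
    w_ent -= lr * 2 * e * x           # adjust parameters from the entity error
    w_rel -= lr * 2 * r * x           # adjust parameters from the relation error
```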
11. The method of claim 10, wherein the adjusting parameters in the initial model for text recognition based on the first predicted entity word, the first tag entity word, the predicted triplet, and the tag triplet, generating a model for text recognition comprises:
generating an entity loss value according to the first predicted entity word and the first tag entity word;
generating a relation loss value according to the prediction triplet and the label triplet;
determining a total loss value corresponding to the text recognition initial model according to the entity loss value and the relation loss value;
and adjusting parameters in the text recognition initial model according to the total loss value to generate the text recognition model.
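Claim 11's loss combination can be illustrated with cross-entropy heads; the probability distributions and the unweighted sum below are made up for the example.

```python
import numpy as np

def cross_entropy(probs, label):
    # Negative log-likelihood of the gold class index.
    return -float(np.log(probs[label]))

# Hypothetical predicted distributions for one training sample.
entity_probs = np.array([0.1, 0.7, 0.2])    # over entity tags
relation_probs = np.array([0.3, 0.6, 0.1])  # over relation labels

entity_loss = cross_entropy(entity_probs, 1)      # entity loss value
relation_loss = cross_entropy(relation_probs, 1)  # relation loss value
total_loss = entity_loss + relation_loss          # total loss used to adjust parameters
```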
12. A data processing apparatus, comprising:
the first acquisition module is used for acquiring a text and acquiring a shared semantic vector corresponding to each word segment in the text;
the second acquisition module is used for acquiring a first entity word belonging to a first part of speech in the text based on the shared semantic vector;
the first generation module is used for acquiring a conditional shared semantic vector corresponding to the first entity word from the shared semantic vector, and carrying out vector fusion on the conditional shared semantic vector and the shared semantic vector to obtain a target semantic vector;
the second generation module is used for carrying out vector recognition on the target semantic vector to obtain a triplet comprising the first entity word, the second entity word belonging to the second part of speech and the relation entity word; the second entity word in the triplet belongs to the text; the relationship entity words in the triples are used for representing the association relationship between the first entity words and the second entity words in the triples.
13. A computer device, comprising: a processor, a memory, and a network interface; the processor is connected to the memory and the network interface, wherein the network interface is configured to provide a data communication function, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to cause the computer device to perform the method of any of claims 1 to 11.
14. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any of claims 1-11.
15. A computer program product, characterized in that the computer program product comprises computer instructions stored in a computer-readable storage medium, the computer instructions being adapted to be read and executed by a processor to cause a computer device having the processor to perform the method according to any of claims 1-11.
CN202111346297.3A 2021-11-15 2021-11-15 Data processing method, device and computer readable storage medium Pending CN116127081A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111346297.3A CN116127081A (en) 2021-11-15 2021-11-15 Data processing method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116127081A true CN116127081A (en) 2023-05-16

Family

ID=86306724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111346297.3A Pending CN116127081A (en) 2021-11-15 2021-11-15 Data processing method, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116127081A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination