CN115221888A - Entity mention identification method, device, equipment and storage medium - Google Patents


Info

Publication number
CN115221888A
Authority
CN
China
Prior art keywords
entity
candidate entity
mention
linked
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110408050.3A
Other languages
Chinese (zh)
Inventor
唐弘胤
孙兴武
张富峥
王仲远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202110408050.3A priority Critical patent/CN115221888A/en
Publication of CN115221888A publication Critical patent/CN115221888A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses an entity mention identification method, device, equipment and storage medium, belonging to the technical field of computers. The method comprises: acquiring a text to be processed; determining at least two candidate entity mentions in the text to be processed; for any candidate entity mention, calculating a first semantic similarity between that candidate entity mention and other candidate entity mentions, wherein the other candidate entity mentions are at least one of the at least two candidate entity mentions other than that candidate entity mention; determining a global semantic similarity of that candidate entity mention by using the first semantic similarity between it and the other candidate entity mentions; and screening target entity mentions from the candidate entity mentions according to the global semantic similarity of each candidate entity mention. Because target entity mentions are further identified through the semantic similarity between the candidate entity mentions in the text, the identification accuracy of entity mentions can be improved.

Description

Entity mention identification method, device, equipment and storage medium
Technical Field
The embodiments of the application relate to the technical field of computers, and in particular to an entity mention identification method, device, equipment and storage medium.
Background
Natural language analysis is an important technology, and covers the technical fields of information retrieval, information extraction, natural language question answering and the like, wherein an entity linking technology is an important component in natural language analysis.
Entity linking links an entity mention in a text to the entity in a knowledge base that the mention actually refers to, where an entity mention is a span of the text that refers to an entity. For example, the text "world cup has not started" contains the entity mention "world cup", and entity linking links the entity mention "world cup" to the entity "FIFA World Cup" in the knowledge base.
In the related art, a two-task deep learning model can be used for entity linking. Specifically, a text is input into the deep learning model. One task of the model is entity mention identification, which identifies the entity mentions in the input text; the other task is entity mention classification, which matches each identified entity mention against entities to determine the category to which the entity mention belongs, that is, the entity to which the entity mention is linked.
From the above, in the entity linking technology, the identification of entity mentions is crucial, and the effect of entity linking is greatly influenced. The related art identifies entity mentions in the text through a deep learning model, and extracts the entity mentions from the text only according to semantic features of the text, so that the extracted entity mentions may be wrong, and the identification accuracy of the entity mentions is low.
Disclosure of Invention
The embodiments of the application provide an entity mention identification method, device, equipment and storage medium, which can be used to solve the problems in the related art. The technical solution is as follows:
in one aspect, an embodiment of the present application provides an entity mention identification method, where the method includes:
acquiring a text to be processed;
determining at least two candidate entity mentions in the text to be processed;
for any candidate entity mention, calculating a first semantic similarity between the any candidate entity mention and other candidate entity mentions, wherein the other candidate entity mentions are at least one candidate entity mention except the any candidate entity mention in the at least two candidate entity mentions;
determining a global semantic similarity of any candidate entity mention by using a first semantic similarity between the any candidate entity mention and the other candidate entity mentions;
and screening target entity mentions from the candidate entity mentions according to the global semantic similarity of the candidate entity mentions.
In one possible implementation, the calculating a first semantic similarity between the any candidate entity mention and the other candidate entity mentions includes:
determining, from an entity library, at least one entity to be linked corresponding to the any candidate entity mention;
determining a semantic vector of each candidate entity mention and a semantic vector of each entity to be linked;
fusing the semantic vector of the any candidate entity mention with the semantic vectors of the entities to be linked corresponding to the any candidate entity mention to obtain a fusion vector of the any candidate entity mention;
and calculating a first semantic similarity between any candidate entity mention and other candidate entity mentions by using the fusion vector of each candidate entity mention.
In a possible implementation manner, the fusing the semantic vector of the any candidate entity mention with the semantic vectors of the entities to be linked corresponding to the any candidate entity mention to obtain a fusion vector of the any candidate entity mention includes:
calculating a second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to the any candidate entity mention by using the semantic vector of the any candidate entity mention and the semantic vector of each entity to be linked corresponding to the any candidate entity mention;
and determining a fusion vector of any candidate entity mention by utilizing a second semantic similarity between any candidate entity mention and each entity to be linked corresponding to any candidate entity mention and a semantic vector of each entity to be linked corresponding to any candidate entity mention.
In one possible implementation, the determining the global semantic similarity of any candidate entity mention by using the first semantic similarity between the any candidate entity mention and the other candidate entity mention includes:
determining the global semantic similarity of any candidate entity mention by using the initial semantic similarity of each candidate entity mention and the first semantic similarity between any candidate entity mention and other candidate entity mentions.
In one possible implementation, the method further includes:
and determining the initial semantic similarity of the any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to the any candidate entity mention and the second semantic similarity between the any candidate entity mention and each of those entities to be linked.
In one possible implementation, the method further includes:
for any entity to be linked, calculating a third semantic similarity between the any entity to be linked and other entities to be linked, wherein the other entities to be linked are at least one entity to be linked, other than the any entity to be linked, among the entities to be linked corresponding to the candidate entity mentions;
and determining the global semantic similarity of any entity to be linked by utilizing the initial semantic similarity of each entity to be linked and the third semantic similarity between any entity to be linked and other entities to be linked.
In one possible implementation, the method further includes:
and determining the initial semantic similarity of any entity to be linked by using the second semantic similarity between the any entity to be linked and each candidate entity mention corresponding to the any entity to be linked.
In one possible implementation, the method further includes:
determining a target link entity from each entity to be linked corresponding to any target entity mention according to the global semantic similarity of each entity to be linked corresponding to any target entity mention;
and linking any target entity mention with the target link entity.
In another aspect, an entity mention identification apparatus is provided, the apparatus including:
the acquisition module is used for acquiring a text to be processed;
the determining module is used for determining at least two candidate entity mentions in the text to be processed;
a calculation module, configured to calculate, for any candidate entity mention, a first semantic similarity between the any candidate entity mention and other candidate entity mentions, where the other candidate entity mentions are at least one candidate entity mention other than the any candidate entity mention in the at least two candidate entity mentions;
the determining module is further configured to determine a global semantic similarity of the any candidate entity mention by using a first semantic similarity between the any candidate entity mention and the other candidate entity mentions;
and the screening module is used for screening the target entity mention from the candidate entity mentions according to the global semantic similarity of the candidate entity mentions.
In a possible implementation manner, the calculation module is configured to determine, from an entity library, at least one entity to be linked corresponding to the any candidate entity mention; determine a semantic vector of each candidate entity mention and a semantic vector of each entity to be linked; fuse the semantic vector of the any candidate entity mention with the semantic vectors of the entities to be linked corresponding to the any candidate entity mention to obtain a fusion vector of the any candidate entity mention; and calculate a first semantic similarity between the any candidate entity mention and the other candidate entity mentions by using the fusion vector of each candidate entity mention.
In a possible implementation manner, the calculation module is configured to calculate a second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to the any candidate entity mention by using the semantic vector of the any candidate entity mention and the semantic vector of each of those entities to be linked; and determine the fusion vector of the any candidate entity mention by using the second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to it and the semantic vector of each of those entities to be linked.
In a possible implementation manner, the determining module is configured to determine the global semantic similarity of the candidate entity mention by using the initial semantic similarity of the candidate entity mention and the first semantic similarity between the candidate entity mention and the other candidate entity mentions.
In a possible implementation manner, the determining module is further configured to determine the initial semantic similarity of the any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to the any candidate entity mention and the second semantic similarity between the any candidate entity mention and each of those entities to be linked.
In a possible implementation manner, the calculation module is further configured to calculate, for any entity to be linked, a third semantic similarity between the any entity to be linked and other entities to be linked, where the other entities to be linked are at least one entity to be linked, other than the any entity to be linked, among the entities to be linked corresponding to the candidate entity mentions;
the determining module is further configured to determine the global semantic similarity of any entity to be linked by using the initial semantic similarity of each entity to be linked and the third semantic similarity between any entity to be linked and the other entities to be linked.
In a possible implementation manner, the determining module is further configured to determine the initial semantic similarity of any entity to be linked by using the second semantic similarity between the any entity to be linked and each candidate entity mention corresponding to the any entity to be linked.
In one possible implementation, the apparatus further includes a linking module, wherein,
the determining module is further configured to determine a target link entity from the entities to be linked corresponding to the any target entity mention according to the global semantic similarity of the entities to be linked corresponding to the any target entity mention;
the linking module is configured to link the any target entity mention with the target link entity.
In another aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory having stored therein at least one instruction which, when executed by the processor, causes the computer device to implement any of the entity mention identification methods described above.
In another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction, when executed, implements any of the entity mention identification methods described above.
In another aspect, a computer program or a computer program product is provided, in which at least one computer instruction is stored, and the at least one computer instruction is loaded and executed by a processor to implement any of the entity mention identification methods described above.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
After the candidate entity mentions in the text are preliminarily identified, the semantic similarity between every two candidate entity mentions in the text is calculated, and the global semantic similarity of each candidate entity mention is calculated from the semantic similarity between that candidate entity mention and the other candidate entity mentions, so as to determine whether the candidate entity mention is a target entity mention. In other words, on top of the preliminary identification of candidate entity mentions, target entity mentions are further identified by using the semantic similarity between the candidate entity mentions in the text, which improves the identification accuracy of entity mentions.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of an entity mention identification method provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of relationships between entity mentions and entities provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of an entity mention identification device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
An embodiment of the present application provides an entity mention identification method. FIG. 1 is a schematic diagram of an implementation environment of the method; the implementation environment includes an electronic device 11, and the electronic device 11 may include at least one of a terminal device or a server. It is understood that the entity mention identification method of the embodiment of the present application may be performed by the electronic device 11.
The terminal device may be at least one of a smart phone, a game console, a desktop computer, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, and a laptop computer.
The server may be one server, or a server cluster formed by multiple servers, or any one of a cloud computing platform and a virtualization center, which is not limited in this embodiment of the present application. The server can be in communication connection with the terminal device through a wired network or a wireless network. The server may have functions of data processing, data storage, data transceiving, and the like, and is not limited in this embodiment of the application.
FIG. 2 is a flowchart of an entity mention identification method provided in an embodiment of the present application. Taking execution by an electronic device as an example, the method may include steps S21 to S25.
Step S21: acquiring a text to be processed.
In the embodiment of the present application, the obtaining manner of the text to be processed is not limited. In a possible implementation manner, a user may input a search term in any application program, and the electronic device obtains the search term input by the user and uses the search term input by the user as a text to be processed.
In another possible implementation manner, a long text may be captured from a network, the long text may be divided into at least two short texts in any manner, and any short text may be used as a text to be processed. For example, a piece of news may be divided into at least two segments in a segmented manner, and each segment of news is used as a text to be processed.
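As a minimal sketch of the segmentation just described, the snippet below splits a long text into paragraphs and treats each paragraph as one text to be processed. The function name `split_into_texts` and the choice of newline-based splitting are assumptions made for illustration; the application leaves the division manner open.

```python
def split_into_texts(long_text):
    """Split a long text into non-empty paragraphs, one text to be processed each."""
    return [part.strip() for part in long_text.split("\n") if part.strip()]


# Hypothetical two-paragraph news item.
news = "The world cup has not started.\n\nTickets go on sale next week."
texts_to_process = split_into_texts(news)
```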
In yet another possible implementation, information containing at least one information type may be obtained, the information type including but not limited to text, video, picture, audio. The text to be processed may be extracted from the acquired information or generated based on the acquired information.
For example, the acquired information is a video, and a text to be processed is extracted from subtitles of each video frame. For another example, the obtained information is audio, a voice conversion technology is adopted to convert the audio into a corresponding text, and the text obtained through conversion is used as a text to be processed.
Step S22: determining at least two candidate entity mentions in the text to be processed.
In a possible implementation manner, an entity mention library includes a plurality of entity mentions collected in advance. Word segmentation may be performed on the text to be processed, each word segmentation result is matched against the entity mentions in the entity mention library, and each successfully matched word segmentation result is taken as a candidate entity mention.
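The library-matching approach can be sketched as follows. This is only an illustration: the word segmentation is assumed to have been done already, and the names `match_candidate_mentions`, `mention_library` and the sample tokens are hypothetical.

```python
def match_candidate_mentions(tokens, mention_library):
    """Keep the word segmentation results that appear in the entity mention library."""
    return [tok for tok in tokens if tok in mention_library]


# Hypothetical pre-collected mention library and pre-segmented text.
mention_library = {"world cup", "brazil", "messi"}
tokens = ["the", "world cup", "in", "brazil", "has", "not", "started"]
candidates = match_candidate_mentions(tokens, mention_library)
```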
In another possible implementation manner, the text to be processed may be input into a pre-trained network model, and the network model outputs the candidate entity mentions in the text. The structure of the network model is not limited. Optionally, the network model may include a cascaded feature extraction submodel and entity mention identification submodel. The feature extraction submodel is used for extracting semantic features of the text to be processed, and its specific structure is not limited; for example, it may be a Transformer-based Bidirectional Encoder Representations from Transformers (BERT) network. The entity mention identification submodel identifies, according to the semantic features of the text to be processed, each entity mention in the text, and each identified entity mention is a candidate entity mention in the embodiment of the present application; the structure of the entity mention identification submodel includes, but is not limited to, a Conditional Random Field (CRF) network.
When the entity mention identification submodel is a CRF network, the CRF network can determine the category of each character in the text to be processed according to the semantic features of the text to be processed, and candidate entity mention in the text to be processed is obtained based on the category of each character. For any character, the category to which the character belongs may be an entity mention category or a non-entity mention category, wherein the entity mention category may include a beginning part of an entity mention and a middle part of an entity mention, or the entity mention category may include the beginning part of an entity mention, the middle part of an entity mention, and an end part of an entity mention.
For example, a training text is labeled with the three-tag BIO scheme. For any character in the training text, label B indicates that the character is the beginning part of an entity mention, label I indicates that the character is the middle part of an entity mention, and label O indicates that the character does not belong to any entity mention, that is, it is a non-entity-mention part. After the CRF network is trained with the training text, the CRF network can output the category of each character in the text to be processed, which is one of B, I or O. A character of category B and the consecutive characters of category I that immediately follow it are combined into one entity mention, and each entity mention obtained in this way is taken as a candidate entity mention in the embodiment of the present application.
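The combination of per-character categories into entity mentions can be sketched as below. The function name `decode_bio` and the sample tags are hypothetical; in practice the tags would come from the trained CRF network, which is not modeled here.

```python
def decode_bio(chars, tags):
    """Combine each B character with the I characters that directly follow it."""
    mentions = []
    current = []
    for ch, tag in zip(chars, tags):
        if tag == "B":
            # A new mention starts; close the previous one if it is still open.
            if current:
                mentions.append("".join(current))
            current = [ch]
        elif tag == "I" and current:
            current.append(ch)
        else:
            # "O", or a stray "I" with no preceding "B": close any open mention.
            if current:
                mentions.append("".join(current))
            current = []
    if current:
        mentions.append("".join(current))
    return mentions


# Hypothetical per-character categories for the text "NBA vs CBA".
chars = list("NBA vs CBA")
tags = ["B", "I", "I", "O", "O", "O", "O", "B", "I", "I"]
candidate_mentions = decode_bio(chars, tags)
```

In this sketch a stray I with no preceding B is treated like O, which is one possible way to handle inconsistent tag sequences.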
Step S23, for any candidate entity mention, calculating a first semantic similarity between any candidate entity mention and other candidate entity mentions, where the other candidate entity mentions are at least one candidate entity mention other than any candidate entity mention in the at least two candidate entity mentions.
In the embodiment of the application, for any candidate entity mention, other candidate entity mentions are part or all of candidate entity mentions except any candidate entity mention in all candidate entity mentions identified from the text to be processed. For any candidate entity mention, the corresponding other candidate entity mention comprises at least one candidate entity mention, and the first semantic similarity between the any candidate entity mention and each candidate entity mention in the other candidate entity mentions is calculated, wherein the calculation mode of the semantic similarity is not limited herein.
In one possible implementation, in calculating the first semantic similarity between any two candidate entity mentions, the feature extraction model (which may be a feature extraction sub-model in the aforementioned network model) may be used to extract respective semantic vectors of the two candidate entity mentions, and the first semantic similarity between the two candidate entity mentions may be calculated by using the semantic vectors of the two candidate entity mentions.
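The application does not fix how the similarity between two semantic vectors is computed. As one common choice, assumed here purely for illustration, cosine similarity could be used; the function name `cosine_similarity` and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical semantic vectors of two candidate entity mentions.
first_similarity = cosine_similarity([1.0, 0.0], [1.0, 1.0])
```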
In another possible implementation manner, calculating a first semantic similarity between any candidate entity mention and other candidate entity mentions includes: determining, from an entity library, at least one entity to be linked corresponding to the candidate entity mention; determining the semantic vector of each candidate entity mention and the semantic vector of each entity to be linked; fusing the semantic vector of the candidate entity mention with the semantic vectors of its corresponding entities to be linked to obtain a fusion vector of the candidate entity mention; and calculating the first semantic similarity between the candidate entity mention and the other candidate entity mentions by using the fusion vector of each candidate entity mention.
In the embodiment of the application, the entity library includes a large number of entities. Any candidate entity mention is matched against the entities in the entity library, and each matched entity is taken as an entity to be linked corresponding to that candidate entity mention. It is to be understood that there may be one or more entities to be linked, which is not limited in the embodiment of the present application.
In one possible implementation manner, the entity library records the correspondence between entity mentions and entities, wherein any entity mention corresponds to at least one entity, and any entity corresponds to at least one entity mention. Any candidate entity mention is matched against the entity mentions in the entity library; if the candidate entity mention is the same as an entity mention in the entity library, the match succeeds, and the at least one entity corresponding to that entity mention in the entity library is taken as the at least one entity to be linked corresponding to the candidate entity mention.
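A minimal sketch of this lookup, assuming the entity library is held as a mapping from mention strings to entity names. The library contents and the function name `entities_to_link` are hypothetical.

```python
# Hypothetical entity library: each entity mention maps to at least one entity.
entity_library = {
    "world cup": ["FIFA World Cup", "Rugby World Cup"],
    "brazil": ["Brazil (country)"],
}


def entities_to_link(candidate_mention, library):
    """Return the entities recorded for an exactly matching mention, or an empty list."""
    return library.get(candidate_mention, [])


linked = entities_to_link("world cup", entity_library)
```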
For any candidate entity mention, the semantic vector mentioned by the candidate entity can be extracted by using a feature extraction model (the feature extraction model can be a feature extraction sub-model in the aforementioned network model), and similarly, for any entity to be linked, the semantic vector of the entity to be linked can also be extracted by using the above feature extraction model. And then, fusing the semantic vector mentioned by any candidate entity with the semantic vector of each candidate entity and each corresponding entity to be linked to obtain a fused vector mentioned by any candidate entity. Therefore, when calculating the first semantic similarity between any candidate entity mention and each other candidate entity mention, the first semantic similarity between any candidate entity mention and each other candidate entity mention can be calculated by using the fused vector mentioned by any candidate entity mention and the fused vector mentioned by each other candidate entity mention.
The present application further provides a possible implementation of vector fusion. Fusing the semantic vector of any candidate entity mention with the semantic vectors of the entities to be linked corresponding to that candidate entity mention, to obtain the fusion vector of that candidate entity mention, includes: calculating a second semantic similarity between the candidate entity mention and each of its corresponding entities to be linked by using the semantic vector of the candidate entity mention and the semantic vector of each corresponding entity to be linked; and determining the fusion vector of the candidate entity mention by using the second semantic similarity between the candidate entity mention and each corresponding entity to be linked and the semantic vector of each corresponding entity to be linked.
In the embodiment of the present application, each candidate entity mention corresponds to at least one entity to be linked. For any candidate entity mention, the second semantic similarity between the candidate entity mention and each corresponding entity to be linked may be calculated by using the semantic vector of the candidate entity mention and the semantic vector of that entity to be linked, and the fusion vector of the candidate entity mention may then be calculated by using these second semantic similarities and the semantic vectors of the corresponding entities to be linked.
In a possible implementation, for any candidate entity mention, the semantic vector of each corresponding entity to be linked may be multiplied by its second semantic similarity, and the products added, to obtain the fusion vector of the candidate entity mention. It can be understood that, if a candidate entity mention corresponds to only one entity to be linked, the semantic vector of that entity to be linked is multiplied by the corresponding second semantic similarity to obtain the fusion vector of the candidate entity mention.
For example, for a candidate entity mention X, which corresponds to two entities to be linked x1 and x2, the second semantic similarity between X and x1 may be multiplied by the semantic vector of x1, the second semantic similarity between X and x2 may be multiplied by the semantic vector of x2, and the two products added to obtain the fusion vector of X; for a candidate entity mention Y, which corresponds to one entity to be linked y, the second semantic similarity between Y and y may be multiplied by the semantic vector of y to obtain the fusion vector of Y, and so on.
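The weighted-sum fusion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the text here does not say how the second semantic similarity is computed, so cosine similarity is used purely as an assumption, and the names `cosine_similarity`, `fusion_vector` and the two-dimensional vectors are hypothetical. The entities to be linked are written x1 and x2.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity, an ASSUMED stand-in for the second semantic similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fusion_vector(mention_vec, entity_vecs):
    """Sum of the entity vectors, each scaled by its similarity to the mention."""
    if not entity_vecs:
        raise ValueError("a candidate entity mention needs at least one entity to be linked")
    fused = [0.0] * len(mention_vec)
    for entity_vec in entity_vecs:
        weight = cosine_similarity(mention_vec, entity_vec)
        for i, component in enumerate(entity_vec):
            fused[i] += weight * component
    return fused


# Hypothetical vectors for mention X and its entities to be linked x1 and x2.
vec_X = [1.0, 0.0]
vec_x1 = [1.0, 0.0]  # same direction as X
vec_x2 = [0.0, 1.0]  # orthogonal to X
fused_X = fusion_vector(vec_X, [vec_x1, vec_x2])

# A mention with a single entity to be linked: one product, no summation needed.
fused_Y = fusion_vector([1.0, 0.0], [[2.0, 0.0]])
```

Under this sketch, an entity to be linked that is unrelated to the mention contributes nothing to the fusion vector, which is the effect the weighting is meant to have.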
Step S24: determine the global semantic similarity of any candidate entity mention by using the first semantic similarity between that candidate entity mention and the other candidate entity mentions.
In the embodiment of the present application, for any candidate entity mention, the global semantic similarity of the candidate entity mention may be determined in any manner by using the first semantic similarity between the candidate entity mention and each other candidate entity mention. The global semantic similarity of a candidate entity mention reflects how closely the candidate entity mention resembles a true entity mention; it may be a probability that the candidate entity mention is an entity mention, or a score for the candidate entity mention being an entity mention.
In one possible implementation, determining the global semantic similarity of any candidate entity mention by using the first semantic similarity between that candidate entity mention and the other candidate entity mentions includes: determining the global semantic similarity of the candidate entity mention by using the initial semantic similarity of each candidate entity mention and the first semantic similarity between the candidate entity mention and the other candidate entity mentions.
That is, for any candidate entity mention, the global semantic similarity may be determined by using the initial semantic similarity of the candidate entity mention itself, the initial semantic similarities of the other candidate entity mentions, and the first semantic similarities between the candidate entity mention and the other candidate entity mentions.
In a possible implementation, a first coefficient and a second coefficient whose sum is 1 may be preset. For any candidate entity mention, the first coefficient is multiplied by the initial semantic similarity of the candidate entity mention to obtain one product part; the initial semantic similarity of each other candidate entity mention is multiplied by the corresponding first semantic similarity, the products are added, and the sum is multiplied by the second coefficient to obtain the other product part; the sum of the two product parts is taken as the global semantic similarity of the candidate entity mention.
For example, if the candidate entity mention is X and the other candidate entity mentions are Y and Z, the initial semantic similarity of X is multiplied by the first coefficient to obtain one product part; the initial semantic similarity of Y is multiplied by the first semantic similarity between X and Y, the initial semantic similarity of Z is multiplied by the first semantic similarity between X and Z, the two products are added, and the sum is multiplied by the second coefficient to obtain the other product part; the sum of the two product parts is taken as the global semantic similarity of X.
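The two-part combination in the example above may be sketched as below. The function name `global_similarity`, the default coefficient of 0.5 and all numeric values are illustrative assumptions; the description only requires that the first and second coefficients sum to 1.

```python
def global_similarity(target, initial, first_sim, first_coeff=0.5):
    """Blend a mention's own initial similarity with that of the other mentions.

    initial:     mention name -> initial semantic similarity
    first_sim:   frozenset({a, b}) -> first semantic similarity between a and b
    first_coeff: the first coefficient; the second is derived as 1 - first_coeff
    """
    second_coeff = 1.0 - first_coeff
    # Each other mention contributes its initial similarity, weighted by how
    # similar it is to the target mention.
    neighbour_sum = sum(
        initial[other] * first_sim[frozenset((target, other))]
        for other in initial
        if other != target
    )
    return first_coeff * initial[target] + second_coeff * neighbour_sum


# Hypothetical values for the candidate entity mentions X, Y and Z.
initial = {"X": 0.8, "Y": 0.6, "Z": 0.4}
first_sim = {
    frozenset(("X", "Y")): 0.5,
    frozenset(("X", "Z")): 0.25,
    frozenset(("Y", "Z")): 0.5,
}
global_X = global_similarity("X", initial, first_sim)
# 0.5 * 0.8 + 0.5 * (0.6 * 0.5 + 0.4 * 0.25) = 0.6
```

Keying the pairwise similarities by `frozenset` is a design choice of this sketch so that the similarity between X and Y is stored once regardless of order.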
Optionally, the entity mention identification method in the embodiment of the present application further includes: determining the initial semantic similarity of any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to that candidate entity mention and the second semantic similarity between the candidate entity mention and each corresponding entity to be linked. Optionally, this step may be performed before "determining the global semantic similarity of any candidate entity mention by using the initial semantic similarity of each candidate entity mention and the first semantic similarity between any candidate entity mention and other candidate entity mentions".
The global semantic similarity of any entity to be linked reflects how closely that entity resembles the target link entity; it may be a probability that the entity to be linked is the target link entity, or a score for the entity to be linked being the target link entity.
For any candidate entity mention, the initial semantic similarity is thus determined from the global semantic similarity of each of its corresponding entities to be linked and the second semantic similarity between the candidate entity mention and each of those entities.
In a possible implementation, for any candidate entity mention, the global semantic similarity of each corresponding entity to be linked may be multiplied by the corresponding second semantic similarity, and the products added, to obtain the initial semantic similarity of the candidate entity mention. It can be understood that, if a candidate entity mention corresponds to only one entity to be linked, the global semantic similarity of that entity to be linked is multiplied by the corresponding second semantic similarity to obtain the initial semantic similarity of the candidate entity mention.
For example, for a candidate entity mention X, which corresponds to two entities to be linked x1 and x2, the global semantic similarity of x1 may be multiplied by the second semantic similarity between X and x1, the global semantic similarity of x2 may be multiplied by the second semantic similarity between X and x2, and the two products added to obtain the initial semantic similarity of X; for a candidate entity mention Y, which corresponds to one entity to be linked y, the global semantic similarity of y may be multiplied by the second semantic similarity between Y and y to obtain the initial semantic similarity of Y, and so on.
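As a rough sketch of the sum of products just described, the initial semantic similarity of a candidate entity mention could be computed as follows. The function name `initial_mention_similarity` and the numeric values are hypothetical; the entities to be linked are written x1, x2 and y.

```python
def initial_mention_similarity(entity_globals, second_sims):
    """Sum over a mention's entities of (global similarity * second similarity).

    entity_globals: entity name -> global semantic similarity of that entity
    second_sims:    entity name -> second semantic similarity between the
                    mention and that entity
    """
    if not entity_globals:
        raise ValueError("a candidate entity mention needs at least one entity to be linked")
    if set(entity_globals) != set(second_sims):
        raise ValueError("both mappings must cover the same entities to be linked")
    return sum(entity_globals[e] * second_sims[e] for e in entity_globals)


# Mention X with two entities to be linked (illustrative numbers only).
init_X = initial_mention_similarity(
    {"x1": 0.9, "x2": 0.3},
    {"x1": 0.8, "x2": 0.5},
)
# 0.9 * 0.8 + 0.3 * 0.5 = 0.87

# Mention Y with a single entity to be linked: a single product.
init_Y = initial_mention_similarity({"y": 0.5}, {"y": 0.6})
# 0.5 * 0.6 = 0.3
```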
In a possible implementation, an embodiment of the present application further provides a way of calculating the global semantic similarity of an entity to be linked, and the entity mention identification method further includes: for any entity to be linked, calculating a third semantic similarity between that entity to be linked and the other entities to be linked, where the other entities to be linked are at least one entity to be linked, other than that entity, among the entities to be linked corresponding to all candidate entity mentions; and determining the global semantic similarity of the entity to be linked by using the initial semantic similarity of each entity to be linked and the third semantic similarity between the entity to be linked and the other entities to be linked. Optionally, this step may be performed before "determining the initial semantic similarity of any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to that candidate entity mention and the second semantic similarity between the candidate entity mention and each corresponding entity to be linked".
Illustratively, the third semantic similarity between any entity to be linked and each other entity to be linked is calculated; the global semantic similarity of the entity to be linked may then be determined by using its own initial semantic similarity, the initial semantic similarity of each other entity to be linked, and the third semantic similarity between it and each other entity to be linked.
In a possible implementation, a third coefficient and a fourth coefficient whose sum is 1 may be set. The third coefficient is multiplied by the initial semantic similarity of the entity to be linked to obtain one product part; the initial semantic similarity of each other entity to be linked is multiplied by the corresponding third semantic similarity, the products are added, and the sum is multiplied by the fourth coefficient to obtain the other product part; the sum of the two product parts is the global semantic similarity of the entity to be linked.
For example, for an entity x1 to be linked, if the other entities to be linked are x2, y, z1 and z2, the initial semantic similarity of x1 may be multiplied by the third coefficient to obtain one product part; the initial semantic similarity of x2 is multiplied by the third semantic similarity between x1 and x2, the initial semantic similarity of y by the third semantic similarity between x1 and y, the initial semantic similarity of z1 by the third semantic similarity between x1 and z1, and the initial semantic similarity of z2 by the third semantic similarity between x1 and z2; the four products are added, and the sum is multiplied by the fourth coefficient to obtain the other product part; the sum of the two product parts is the global semantic similarity of x1.
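The calculation in this example can be written compactly as a formula. The symbols are introduced here only for illustration and do not appear in the application: p(e) denotes the initial semantic similarity of an entity to be linked e, g(e) its global semantic similarity, s_3 the third semantic similarity, and \lambda_3, \lambda_4 the third and fourth coefficients.

```latex
g(e_i) = \lambda_3\, p(e_i) + \lambda_4 \sum_{j \neq i} p(e_j)\, s_3(e_i, e_j),
\qquad \lambda_3 + \lambda_4 = 1

% Instance for x1, with the other entities to be linked x2, y, z1 and z2:
g(x_1) = \lambda_3\, p(x_1) + \lambda_4 \bigl[\, p(x_2)\, s_3(x_1, x_2) + p(y)\, s_3(x_1, y)
       + p(z_1)\, s_3(x_1, z_1) + p(z_2)\, s_3(x_1, z_2) \,\bigr]
```

The global semantic similarity of a candidate entity mention has the same shape, with the first and second coefficients and the first semantic similarity in place of the third and fourth coefficients and the third semantic similarity.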
In a possible implementation, the entity mention identification method provided in the embodiment of the present application further includes: determining the initial semantic similarity of any entity to be linked by using the second semantic similarity between that entity to be linked and each candidate entity mention corresponding to it. Optionally, this step may be performed before "determining the global semantic similarity of any entity to be linked by using the initial semantic similarity of each entity to be linked and the third semantic similarity between any entity to be linked and other entities to be linked".
In the embodiment of the present application, the initial semantic similarity of any entity to be linked may be determined by using the second semantic similarity between that entity to be linked and each candidate entity mention corresponding to it. For example, for an entity x1 to be linked whose corresponding candidate entity mention is X, the initial semantic similarity of x1 may be determined by using the second semantic similarity between X and x1.
It can be understood that one entity to be linked may correspond to at least one candidate entity mention, which is not limited in the embodiments of the present application.
Step S25: screen target entity mentions from the candidate entity mentions according to the global semantic similarity of each candidate entity mention.
Optionally, a semantic similarity standard line (that is, a threshold) may be set in advance. For any candidate entity mention, if its global semantic similarity is not less than the semantic similarity standard line, the candidate entity mention is a target entity mention; if its global semantic similarity is less than the semantic similarity standard line, the candidate entity mention is not a target entity mention.
The entity mention identification method provided by the embodiment of the present application preliminarily identifies the candidate entity mentions in a text, calculates the semantic similarity between every two candidate entity mentions, and uses the semantic similarity between one candidate entity mention and the other candidate entity mentions to calculate the global semantic similarity of that candidate entity mention, from which it determines whether the candidate entity mention is a target entity mention. In this way, after the candidate entity mentions are preliminarily identified, the target entity mentions are further identified by using the semantic similarity between candidate entity mentions in the text, which improves the identification accuracy of entity mentions.
In another possible implementation of the embodiment of the present application, the entity mention identification method further includes: determining a target link entity from the entities to be linked corresponding to any target entity mention according to the global semantic similarity of each of those entities to be linked; and linking the target entity mention with the target link entity. Optionally, this step may be performed after step S25.
According to the method provided by the embodiment of the present application, the global semantic similarity of each entity to be linked can be calculated. In a possible implementation, for any target entity mention, the maximum global semantic similarity among the entities to be linked corresponding to that target entity mention is determined, the entity to be linked having the maximum global semantic similarity is determined as the target link entity, and the target entity mention is linked with the target link entity.
In another possible implementation, for any target entity mention, each global semantic similarity that is greater than or equal to a preset threshold is identified among the entities to be linked corresponding to that target entity mention, the entity to be linked having such a global semantic similarity is determined as a target link entity, and the target entity mention is linked with the target link entity.
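The two selection strategies above (maximum and preset threshold) might look like the following sketch. The function names and values are hypothetical. Returning every entity that shares the maximum follows the worked example in this description, where two entities to be linked with equal global semantic similarity are both treated as target link entities.

```python
def link_by_maximum(entity_globals):
    """Return the entity or entities with the largest global semantic similarity."""
    if not entity_globals:
        raise ValueError("a target entity mention needs at least one entity to be linked")
    best = max(entity_globals.values())
    # Entities tied at the maximum are all kept.
    return [entity for entity, score in entity_globals.items() if score == best]


def link_by_threshold(entity_globals, threshold):
    """Return every entity whose global semantic similarity reaches the threshold."""
    return [entity for entity, score in entity_globals.items() if score >= threshold]


# Illustrative global semantic similarities for the entities of one mention.
clear_winner = link_by_maximum({"x1": 0.65, "x2": 0.4})
tied = link_by_maximum({"x1": 0.5, "x2": 0.5})
above_threshold = link_by_threshold({"z1": 0.7, "z2": 0.2}, threshold=0.5)
```

Note that the threshold strategy can return no entity at all when every global semantic similarity falls below the threshold; how that case is handled is not specified here and is left open in this sketch.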
In the method provided by the embodiment of the present application, in order to facilitate storage of each candidate entity mention, each entity to be linked, and each similarity (including the aforementioned first semantic similarity, second semantic similarity, and third semantic similarity), a relationship diagram between an entity mention and an entity may be determined according to each candidate entity mention, each entity to be linked, and each similarity.
Fig. 3 is a schematic diagram of the relationships between entity mentions and entities provided in the embodiment of the present application. For the candidate entity mention X, two entities can be matched from the entity library; both are entities to be linked corresponding to X and are denoted x1 and x2. For the candidate entity mention Y, one entity can be matched from the entity library; it is the entity to be linked corresponding to Y and is denoted y. For the candidate entity mention Z, two entities can be matched from the entity library; both are entities to be linked corresponding to Z and are denoted z1 and z2.
Next, taking the relationship diagram shown in fig. 3 as an example, the identification method of the entity mention provided in the embodiment of the present application is illustrated.
For example, for the candidate entity mention X, which corresponds to two entities to be linked x1 and x2, the second semantic similarity between X and x1 may be calculated by using the semantic vector of X and the semantic vector of x1, and the second semantic similarity between X and x2 may be calculated by using the semantic vector of X and the semantic vector of x2; for the candidate entity mention Y, which corresponds to one entity to be linked y, the second semantic similarity between Y and y may be calculated by using the semantic vector of Y and the semantic vector of y, and so on.
Then, a relationship diagram between entity mentions and entities is constructed according to the candidate entity mentions X, Y and Z, their corresponding entities to be linked, and the second semantic similarities between them. In the diagram, each node represents a candidate entity mention or an entity to be linked, and a line segment between a candidate entity mention and an entity to be linked represents the second semantic similarity between them.
For the entity x1 to be linked, the other entities to be linked are x2, y, z1 and z2, so the third semantic similarity between x1 and each of x2, y, z1 and z2 can be calculated, and so on. The third semantic similarities are added to the relationship diagram; that is, a line segment between any two entities to be linked represents the third semantic similarity between them.
For the candidate entity mention X, which corresponds to the entities to be linked x1 and x2, the semantic vector of X, the semantic vector of x1 and the semantic vector of x2 can be fused to obtain the fusion vector of X; for the candidate entity mention Y, which corresponds to the entity to be linked y, the semantic vector of Y and the semantic vector of y can be fused to obtain the fusion vector of Y, and so on.
For the candidate entity mention X, the other candidate entity mentions are Y and Z, so the first semantic similarity between X and Y and the first semantic similarity between X and Z may be calculated. The first semantic similarity between X and Y is calculated by using the fusion vector of X and the fusion vector of Y; the first semantic similarity between X and Z is calculated by using the fusion vector of X and the fusion vector of Z, and so on. The first semantic similarities are added to the relationship diagram; that is, a line segment between any two candidate entity mentions represents the first semantic similarity between them.
The resulting relationship diagram between entity mentions and entities is shown in fig. 3: each node represents a candidate entity mention or an entity to be linked, and a line segment between two nodes represents the similarity between the nodes at its two ends. As shown in fig. 3, node X represents the candidate entity mention X, node x1 represents the entity x1 to be linked, the line segment between X and x1 represents the similarity between X and x1 (i.e., the second semantic similarity mentioned above), the line segment between x1 and x2 represents the similarity between x1 and x2 (i.e., the third semantic similarity mentioned above), and the line segment between X and Y represents the similarity between X and Y (i.e., the first semantic similarity mentioned above).
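One possible in-memory form of such a relationship diagram is sketched below. The class `RelationGraph` and its methods are hypothetical and are not taken from the application; the sketch only illustrates that the kind of similarity on an edge follows from the types of the two nodes it connects. Entities to be linked are written in lower case (x1, x2, y) to keep them distinct from mentions.

```python
from dataclasses import dataclass, field


@dataclass
class RelationGraph:
    """Nodes are candidate entity mentions or entities to be linked."""

    mentions: set = field(default_factory=set)
    entities: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)  # frozenset({a, b}) -> similarity

    def add_mention(self, name):
        self.mentions.add(name)

    def add_entity(self, name):
        self.entities.add(name)

    def add_edge(self, a, b, similarity):
        if a == b:
            raise ValueError("an edge needs two distinct nodes")
        for node in (a, b):
            if node not in self.mentions and node not in self.entities:
                raise KeyError(node)
        self.edges[frozenset((a, b))] = similarity

    def similarity(self, a, b):
        return self.edges[frozenset((a, b))]

    def edge_kind(self, a, b):
        """'first': mention-mention, 'second': mention-entity, 'third': entity-entity."""
        mention_count = int(a in self.mentions) + int(b in self.mentions)
        return {2: "first", 1: "second", 0: "third"}[mention_count]


# Part of the diagram of fig. 3, with illustrative similarity values.
graph = RelationGraph()
for mention in ("X", "Y"):
    graph.add_mention(mention)
for entity in ("x1", "x2", "y"):
    graph.add_entity(entity)
graph.add_edge("X", "x1", 0.8)   # second semantic similarity
graph.add_edge("x1", "x2", 0.5)  # third semantic similarity
graph.add_edge("X", "Y", 0.5)    # first semantic similarity
```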
It can be understood that fig. 3 uses the candidate entity mentions X, Y and Z and the entities to be linked x1, x2, y, z1 and z2 as examples; in practical applications, the number of candidate entity mentions and entities to be linked is not limited.
In the embodiment of the present application, for the entity x1 to be linked, whose corresponding candidate entity mention is X, the initial semantic similarity of x1 may be determined by using the second semantic similarity between X and x1, and so on. For the entity x1 to be linked, the other entities to be linked are x2, y, z1 and z2, so the global semantic similarity of x1 can be calculated by using the initial semantic similarities of x1, x2, y, z1 and z2 and the third semantic similarities between x1 and each of x2, y, z1 and z2, and so on.
For the candidate entity mention X, which corresponds to two entities to be linked x1 and x2, the initial semantic similarity of X may be calculated by using the global semantic similarity of x1, the global semantic similarity of x2, the second semantic similarity between X and x1, and the second semantic similarity between X and x2, and so on.
For the candidate entity mention X, the other candidate entity mentions are Y and Z, so the global semantic similarity of X may be calculated by using the initial semantic similarity of X, the initial semantic similarity of Y, the initial semantic similarity of Z, the first semantic similarity between X and Y, and the first semantic similarity between X and Z, and so on.
For the candidate entity mention X, if the global semantic similarity of X is not less than the semantic similarity standard line, X is a target entity mention; if the global semantic similarity of X is less than the semantic similarity standard line, X is not a target entity mention, and so on.
Suppose the candidate entity mention X is a target entity mention, and it corresponds to two entities to be linked x1 and x2. If the global semantic similarity of x1 is greater than that of x2, x1 is the target link entity, and X is linked with x1; if the global semantic similarity of x2 is greater than that of x1, x2 is the target link entity, and X is linked with x2; if the global semantic similarity of x1 is equal to that of x2, x1 and x2 are both target link entities, and X is linked with both x1 and x2.
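The order in which the quantities in this walkthrough depend on one another can be sketched end to end as follows. This is a simplified illustration under stated assumptions: only two mentions (X and Y) are used, all similarity values are invented, a single coefficient of 0.5 stands in for both the first/second and the third/fourth coefficient pairs, and the initial semantic similarity of each entity to be linked is taken to be its second semantic similarity with its one mention. The helper `propagate` is a hypothetical name.

```python
def propagate(initial, pairwise, self_coeff):
    """Blend each node's initial score with the other nodes' initial scores,
    weighted by the pairwise similarity to the node."""
    result = {}
    for node in initial:
        neighbour_sum = sum(
            initial[other] * pairwise[frozenset((node, other))]
            for other in initial
            if other != node
        )
        result[node] = self_coeff * initial[node] + (1.0 - self_coeff) * neighbour_sum
    return result


# Invented inputs: mention X has entities x1 and x2, mention Y has entity y.
mention_entities = {"X": ["x1", "x2"], "Y": ["y"]}
second = {("X", "x1"): 0.8, ("X", "x2"): 0.4, ("Y", "y"): 0.6}
third = {
    frozenset(("x1", "x2")): 0.5,
    frozenset(("x1", "y")): 0.5,
    frozenset(("x2", "y")): 0.0,
}
first = {frozenset(("X", "Y")): 0.5}
COEFF = 0.5          # assumed value; stands in for both coefficient pairs
STANDARD_LINE = 0.4  # assumed semantic similarity standard line

# 1. Initial similarity of each entity to be linked (from the second similarity).
entity_initial = {e: second[(m, e)] for m, es in mention_entities.items() for e in es}
# 2. Global similarity of each entity to be linked (uses the third similarity).
entity_global = propagate(entity_initial, third, COEFF)
# 3. Initial similarity of each mention (entity global * second similarity, summed).
mention_initial = {
    m: sum(entity_global[e] * second[(m, e)] for e in es)
    for m, es in mention_entities.items()
}
# 4. Global similarity of each mention (uses the first similarity).
mention_global = propagate(mention_initial, first, COEFF)
# 5. Screening against the standard line.
targets = [m for m, score in mention_global.items() if score >= STANDARD_LINE]
# 6. Linking each target mention to its best entity or entities.
links = {}
for m in targets:
    best = max(entity_global[e] for e in mention_entities[m])
    links[m] = [e for e in mention_entities[m] if entity_global[e] == best]
```

With these invented numbers, X passes the standard line and is linked with x1, while Y is screened out; different coefficients or a different standard line would change that outcome.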
Based on the same technical concept, referring to fig. 4, an embodiment of the present application provides an entity mention identification apparatus 40, including:
an obtaining module 41, configured to obtain a text to be processed;
a determining module 42, configured to determine at least two candidate entity mentions in the text to be processed;
a calculating module 43, configured to calculate, for any candidate entity mention, a first semantic similarity between the any candidate entity mention and other candidate entity mentions, where the other candidate entity mentions are at least one candidate entity mention other than any candidate entity mention in at least two candidate entity mentions;
the determining module 42 is further configured to determine a global semantic similarity of any candidate entity mention by using a first semantic similarity between any candidate entity mention and other candidate entity mentions;
and the screening module 44 is configured to screen the target entity mention from the candidate entity mentions according to the global semantic similarity of the candidate entity mention.
In a possible implementation, the calculating module 43 is configured to determine, from the entity library, at least one entity to be linked corresponding to any candidate entity mention; determine the semantic vector of each candidate entity mention and the semantic vector of each entity to be linked; fuse the semantic vector of any candidate entity mention with the semantic vectors of the entities to be linked corresponding to that candidate entity mention to obtain the fusion vector of the candidate entity mention; and calculate the first semantic similarity between any candidate entity mention and the other candidate entity mentions by using the fusion vector of each candidate entity mention.
In a possible implementation, the calculating module 43 is configured to calculate the second semantic similarity between any candidate entity mention and each entity to be linked corresponding to it by using the semantic vector of the candidate entity mention and the semantic vector of each corresponding entity to be linked; and determine the fusion vector of the candidate entity mention by using the second semantic similarity between the candidate entity mention and each corresponding entity to be linked and the semantic vector of each corresponding entity to be linked.
In one possible implementation, the determining module 42 is configured to determine the global semantic similarity of any candidate entity mention by using the initial semantic similarity of each candidate entity mention and the first semantic similarity between any candidate entity mention and other candidate entity mentions.
In a possible implementation, the determining module 42 is further configured to determine the initial semantic similarity of any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to that candidate entity mention and the second semantic similarity between the candidate entity mention and each corresponding entity to be linked.
In a possible implementation, the calculating module 43 is further configured to calculate, for any entity to be linked, a third semantic similarity between that entity to be linked and the other entities to be linked, where the other entities to be linked are at least one entity to be linked, other than that entity, among the entities to be linked corresponding to all candidate entity mentions;
the determining module 42 is further configured to determine the global semantic similarity of any entity to be linked by using the initial semantic similarity of each entity to be linked and the third semantic similarity between any entity to be linked and other entities to be linked.
In a possible implementation, the determining module 42 is further configured to determine the initial semantic similarity of any entity to be linked by using the second semantic similarity between that entity to be linked and each candidate entity mention corresponding to it.
In a possible implementation, the entity mention identification apparatus 40 further includes a linking module, wherein
the determining module 42 is further configured to determine a target link entity from the entities to be linked corresponding to any target entity mention according to the global semantic similarity of each of those entities to be linked;
and the linking module is configured to link the target entity mention with the target link entity.
In the embodiment of the present application, the candidate entity mentions in a text are preliminarily identified, the semantic similarity between every two candidate entity mentions is calculated, and the semantic similarity between one candidate entity mention and the other candidate entity mentions is used to calculate the global semantic similarity of that candidate entity mention, from which it is determined whether the candidate entity mention is a target entity mention. In this way, after the candidate entity mentions are preliminarily identified, the target entity mentions are further identified by using the semantic similarity between candidate entity mentions in the text, which improves the identification accuracy of entity mentions.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments, which are not described herein again.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The device may be a terminal, for example: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. A terminal may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 500 includes: a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 501 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a Central Processing Unit (CPU), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 501 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 502 may include one or more computer-readable storage media, which may be non-transitory. Memory 502 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 502 is used to store at least one instruction for execution by processor 501 to implement the entity mention identification method provided by the method embodiments herein.
In some embodiments, the terminal may further optionally include: a peripheral interface 503 and at least one peripheral. The processor 501, memory 502 and peripheral interface 503 may be connected by a bus or signal lines. Each peripheral may be connected to the peripheral interface 503 by a bus, signal line, or circuit board. Specifically, the peripheral includes at least one of radio frequency circuit 504, display screen 505, camera assembly 506, audio circuit 507, positioning assembly 508, and power supply 509.
The peripheral interface 503 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 501 and the memory 502. In some embodiments, the processor 501, memory 502, and peripheral interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 504 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 504 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 504 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, the display screen 505 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, disposed on the front panel of the terminal; in other embodiments, there may be at least two display screens 505, respectively disposed on different surfaces of the terminal or in a folded design; in still other embodiments, the display screen 505 may be a flexible display disposed on a curved surface or on a folded surface of the terminal. The display screen 505 may even be arranged as a non-rectangular irregular figure, i.e., an irregularly shaped screen. The display screen 505 may be made using LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other technologies.
The camera assembly 506 is used to capture images or video. Optionally, camera assembly 506 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 506 may also include a flash. The flash lamp can be a single-color temperature flash lamp or a double-color temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp and can be used for light compensation under different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input the electrical signals to the processor 501 for processing, or to the radio frequency circuit 504 to realize voice communication. For stereo sound collection or noise reduction, a plurality of microphones may be arranged at different parts of the terminal. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a conventional diaphragm speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal into a sound wave audible to humans, or into a sound wave inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to determine the current geographic location of the terminal to implement navigation or LBS (Location Based Service). The positioning component 508 may be based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 509 is used to supply power to the various components in the terminal. The power supply 509 may be an alternating current supply, a direct current supply, a disposable battery, or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also support fast charging technology.
In some embodiments, the terminal also includes one or more sensors 510. The one or more sensors 510 include, but are not limited to: acceleration sensor 511, gyro sensor 512, pressure sensor 513, fingerprint sensor 514, optical sensor 515, and proximity sensor 516.
The acceleration sensor 511 may detect the magnitude of acceleration on three coordinate axes of a coordinate system established with the terminal. For example, the acceleration sensor 511 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 501 may control the touch screen 505 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 511. The acceleration sensor 511 may also be used for acquisition of motion data of a game or a user.
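As a minimal sketch of the landscape/portrait decision described above, the following Python function compares the gravity components along two axes of the terminal. The function name `choose_orientation`, the axis convention (x along the short edge, y along the long edge), and the simple magnitude comparison are illustrative assumptions and are not taken from this application.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Pick a view from gravity components (hypothetical helper).

    gx: gravity component along the terminal's short edge (assumed x axis).
    gy: gravity component along the terminal's long edge (assumed y axis).
    """
    # When gravity acts mostly along the long edge, the terminal is held upright.
    return "portrait" if abs(gy) >= abs(gx) else "landscape"
```

A practical implementation would likely add hysteresis so that the view does not flip back and forth near the diagonal.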
The gyro sensor 512 can detect the body direction and the rotation angle of the terminal, and the gyro sensor 512 and the acceleration sensor 511 can cooperate to acquire the 3D movement of the user on the terminal. The processor 501 may implement the following functions according to the data collected by the gyro sensor 512: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 513 may be disposed on a side frame of the terminal and/or under the touch display screen 505. When the pressure sensor 513 is disposed on the side frame of the terminal, the holding signal of the user to the terminal can be detected, and the processor 501 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 513. When the pressure sensor 513 is disposed at the lower layer of the touch display screen 505, the processor 501 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 505. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 514 is used to collect the fingerprint of the user. The processor 501 identifies the user according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 itself identifies the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 501 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 514 may be disposed on the front, back, or side of the terminal. When a physical button or vendor logo is provided on the terminal, the fingerprint sensor 514 may be integrated with the physical button or vendor logo.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 may control the display brightness of the touch display screen 505 based on the ambient light intensity collected by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 505 is turned down. In another embodiment, the processor 501 may also dynamically adjust the shooting parameters of the camera assembly 506 based on the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also known as a distance sensor, is typically provided on the front panel of the terminal. The proximity sensor 516 is used to measure the distance between the user and the front surface of the terminal. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front surface of the terminal gradually decreases, the processor 501 controls the touch display screen 505 to switch from the bright screen state to the dark screen state; when the proximity sensor 516 detects that the distance between the user and the front surface of the terminal gradually increases, the processor 501 controls the touch display screen 505 to switch from the dark screen state to the bright screen state.
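The brightness adjustment described above can be illustrated with a short Python sketch. The linear mapping, the function name `display_brightness`, and the parameters `min_level`, `max_level`, and `max_lux` are assumptions chosen for illustration; the application states only that brightness is turned up under strong ambient light and turned down under weak ambient light.

```python
def display_brightness(ambient_lux: float,
                       min_level: float = 0.1,
                       max_level: float = 1.0,
                       max_lux: float = 1000.0) -> float:
    """Map ambient light intensity to a brightness level (hypothetical mapping)."""
    # Clamp the reading into [0, max_lux] so outliers cannot exceed the range.
    clamped = min(max(ambient_lux, 0.0), max_lux)
    # Brighter surroundings give a proportionally brighter screen.
    return min_level + (max_level - min_level) * (clamped / max_lux)
```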
Those skilled in the art will appreciate that the structure shown in FIG. 5 does not constitute a limitation on the terminal, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
In an exemplary embodiment, a computer device is also provided that includes a processor and a memory having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein. The at least one instruction, at least one program, set of codes, or set of instructions is configured to be executed by one or more processors to implement any of the entity mention identification methods described above.
In an exemplary embodiment, there is also provided a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions which, when executed by a processor of a computer device, implements any of the entity mention identification methods described above.
Optionally, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided a computer program or a computer program product having at least one computer instruction stored therein, which is loaded and executed by a processor to implement any of the entity mention identification methods described above.
It should be understood that reference herein to "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three relationships may exist. For example, A and/or B may indicate: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The above description is only exemplary of the application and should not be taken as limiting the application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the application should be included in the protection scope of the application.

Claims (10)

1. An entity mention identification method, the method comprising:
acquiring a text to be processed;
determining at least two candidate entity mentions in the text to be processed;
for any candidate entity mention, calculating a first semantic similarity between the any candidate entity mention and other candidate entity mentions, wherein the other candidate entity mentions are at least one candidate entity mention except the any candidate entity mention in the at least two candidate entity mentions;
determining a global semantic similarity of any candidate entity mention by using a first semantic similarity between the any candidate entity mention and the other candidate entity mentions;
and screening target entity mentions from the candidate entity mentions according to the global semantic similarity of the candidate entity mentions.
2. The method of claim 1, wherein said calculating a first semantic similarity between any of said candidate entity mentions and other candidate entity mentions comprises:
determining, from an entity library, at least one entity to be linked corresponding to the any candidate entity mention;
determining a semantic vector of each candidate entity mention and a semantic vector of each entity to be linked;
fusing the semantic vector of the any candidate entity mention with the semantic vector of each entity to be linked corresponding to the any candidate entity mention to obtain a fusion vector of the any candidate entity mention;
and calculating a first semantic similarity between the any candidate entity mention and the other candidate entity mentions by using the fusion vector of each candidate entity mention.
3. The method according to claim 2, wherein the fusing the semantic vector of the any candidate entity mention with the semantic vector of each entity to be linked corresponding to the any candidate entity mention to obtain the fusion vector of the any candidate entity mention comprises:
calculating a second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to the any candidate entity mention by using the semantic vector of the any candidate entity mention and the semantic vector of each entity to be linked corresponding to the any candidate entity mention;
and determining the fusion vector of the any candidate entity mention by using the second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to the any candidate entity mention and the semantic vector of each entity to be linked corresponding to the any candidate entity mention.
4. The method of claim 3, wherein said determining the global semantic similarity of said any candidate entity mention using the first semantic similarity between said any candidate entity mention and said other candidate entity mention comprises:
determining the global semantic similarity of any candidate entity mention by using the initial semantic similarity of each candidate entity mention and the first semantic similarity between any candidate entity mention and other candidate entity mentions.
5. The method of claim 4, further comprising:
determining the initial semantic similarity of the any candidate entity mention by using the global semantic similarity of each entity to be linked corresponding to the any candidate entity mention and the second semantic similarity between the any candidate entity mention and each entity to be linked corresponding to the any candidate entity mention.
6. The method of claim 5, further comprising:
for any entity to be linked, calculating a third semantic similarity between the any entity to be linked and other entities to be linked, wherein the other entities to be linked are at least one entity to be linked, other than the any entity to be linked, among the entities to be linked corresponding to the candidate entity mentions;
and determining the global semantic similarity of the any entity to be linked by using the initial semantic similarity of each entity to be linked and the third semantic similarity between the any entity to be linked and the other entities to be linked.
7. The method of claim 6, further comprising:
determining the initial semantic similarity of the any entity to be linked by using the second semantic similarity between the any entity to be linked and each candidate entity mention corresponding to the any entity to be linked.
8. The method according to any one of claims 5-7, further comprising:
for any target entity mention, determining a target link entity from the entities to be linked corresponding to the any target entity mention according to the global semantic similarity of each entity to be linked corresponding to the any target entity mention;
and linking the any target entity mention with the target link entity.
9. An apparatus for entity mention identification, the apparatus comprising:
the acquisition module is used for acquiring a text to be processed;
the determining module is used for determining at least two candidate entity mentions in the text to be processed;
a calculation module, configured to calculate, for any candidate entity mention, a first semantic similarity between the any candidate entity mention and other candidate entity mentions, where the other candidate entity mentions are at least one candidate entity mention other than the any candidate entity mention in the at least two candidate entity mentions;
the determining module is further configured to determine a global semantic similarity of the any candidate entity mention by using a first semantic similarity between the any candidate entity mention and the other candidate entity mentions;
and the screening module is used for screening the target entity mention from the candidate entity mentions according to the global semantic similarity of the candidate entity mentions.
10. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, which when executed by the processor, causes the computer device to implement the method of entity mention identification as claimed in any one of claims 1 to 8.
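
The flow of claims 1 to 3 can be illustrated with a minimal Python sketch. The claims do not fix concrete formulas, so the following choices are assumptions made only for illustration: the second semantic similarity is a softmax over dot products, the fusion vector is the similarity-weighted sum of the vectors of the entities to be linked, the first semantic similarity is a cosine similarity between fusion vectors, the global semantic similarity is the mean first similarity to the other mentions, and screening uses a fixed threshold. All function names, dictionary keys, and sample vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fusion_vector(mention_vec, entity_vecs):
    # Second semantic similarity (assumed form): softmax over dot products.
    weights = softmax([dot(mention_vec, e) for e in entity_vecs])
    # Fusion vector (assumed form): weighted sum of the entity vectors.
    return [sum(w * e[i] for w, e in zip(weights, entity_vecs))
            for i in range(len(mention_vec))]

def global_similarities(fused):
    # Global semantic similarity (assumed form): mean first similarity
    # between one mention and every other mention. Needs >= 2 mentions.
    scores = []
    for i, vi in enumerate(fused):
        others = [cosine(vi, vj) for j, vj in enumerate(fused) if j != i]
        scores.append(sum(others) / len(others))
    return scores

def screen_mentions(mentions, threshold=0.4):
    fused = [fusion_vector(m["vec"], m["entities"]) for m in mentions]
    scores = global_similarities(fused)
    return [m["text"] for m, s in zip(mentions, scores) if s >= threshold]

# Hypothetical candidate mentions with toy two-dimensional vectors.
mentions = [
    {"text": "mention_a", "vec": [1.0, 0.0], "entities": [[1.0, 0.0]]},
    {"text": "mention_b", "vec": [1.0, 0.1], "entities": [[0.9, 0.1]]},
    {"text": "mention_c", "vec": [0.0, 1.0], "entities": [[0.0, 1.0]]},
]
targets = screen_mentions(mentions, threshold=0.4)
```

In this toy input, the two mentions whose fusion vectors point in nearly the same direction support each other and pass the screening, while the unrelated third mention receives a low global score and is filtered out.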

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110408050.3A CN115221888A (en) 2021-04-15 2021-04-15 Entity mention identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115221888A true CN115221888A (en) 2022-10-21

Family

ID=83604990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110408050.3A Pending CN115221888A (en) 2021-04-15 2021-04-15 Entity mention identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115221888A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665228A (en) * 2023-07-31 2023-08-29 恒生电子股份有限公司 Image processing method and device
CN116665228B (en) * 2023-07-31 2023-10-13 恒生电子股份有限公司 Image processing method and device

Similar Documents

Publication Publication Date Title
CN110807361B (en) Human body identification method, device, computer equipment and storage medium
CN110222789B (en) Image recognition method and storage medium
CN111382624A (en) Action recognition method, device, equipment and readable storage medium
CN108288032B (en) Action characteristic acquisition method, device and storage medium
CN111127509B (en) Target tracking method, apparatus and computer readable storage medium
CN110059686B (en) Character recognition method, device, equipment and readable storage medium
CN112581358B (en) Training method of image processing model, image processing method and device
CN112261491B (en) Video time sequence marking method and device, electronic equipment and storage medium
CN111027490A (en) Face attribute recognition method and device and storage medium
CN112084811A (en) Identity information determining method and device and storage medium
CN111339737A (en) Entity linking method, device, equipment and storage medium
CN110705614A (en) Model training method and device, electronic equipment and storage medium
CN113918767A (en) Video clip positioning method, device, equipment and storage medium
CN112001442B (en) Feature detection method, device, computer equipment and storage medium
CN110853124B (en) Method, device, electronic equipment and medium for generating GIF dynamic diagram
CN111563201A (en) Content pushing method, device, server and storage medium
CN110837557A (en) Abstract generation method, device, equipment and medium
CN115221888A (en) Entity mention identification method, device, equipment and storage medium
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition
CN113593521B (en) Speech synthesis method, device, equipment and readable storage medium
CN112989198B (en) Push content determination method, device, equipment and computer-readable storage medium
CN114817709A (en) Sorting method, device, equipment and computer readable storage medium
CN113936240A (en) Method, device and equipment for determining sample image and storage medium
CN112487162A (en) Method, device and equipment for determining text semantic information and storage medium
CN111597823A (en) Method, device and equipment for extracting central word and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination