
Named entity recognition method, device and equipment based on a matching picture (CN112329471A)

Info

Publication number: CN112329471A (application CN202110014000.7A; granted publication CN112329471B)
Authority: CN (China)
Prior art keywords: text, information, processed, matching, entity
Legal status: Granted
Application number: CN202110014000.7A
Other languages: Chinese (zh)
Other versions: CN112329471B (granted publication)
Inventors: 李直旭, 陈志刚, 陈大伟, 何莹
Current assignee: Iflytek Suzhou Technology Co Ltd
Original assignee: Iflytek Suzhou Technology Co Ltd
Application filed by Iflytek Suzhou Technology Co Ltd
Priority to CN202110014000.7A
Publication of CN112329471A
Application granted; publication of CN112329471B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches


Abstract

The invention discloses a named entity recognition method, device and equipment based on a matching picture. The conception of the invention is as follows. In certain scenarios the input text carries insufficient information and takes non-uniform forms, so that objects with indefinite meanings in the text are hard to recognize; the invention therefore introduces attribute information of the picture attached in such scenarios, together with deep information of the text, to assist named entity recognition. In particular, the invention converts image information to the text level, so that the image attributes, the deep text information and the basic text information are expressed in a unified form. When the basic text information, the deep information, the picture attribute information and the basic visual information are then processed jointly in a multi-modal manner, the shortage of input text information is compensated on the one hand, and the spatial heterogeneity between image and text is reduced on the other, so that text and image can interact fully and combine deeply. The efficiency and accuracy of named entity recognition in such scenarios can thereby be greatly improved.

Description

Named entity recognition method, device and equipment based on a matching picture
Technical Field
The invention relates to the field of knowledge graphs, and in particular to a named entity recognition method, device and equipment based on a matching picture.
Background
Named Entity Recognition (NER) is a key technology in information extraction. Its main task is to recognize entities with specific meanings in a text, chiefly including person names, place names, organization names and proper nouns. The implementation generally first determines the boundary of each entity in the input text and then determines the entity's type label.
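As an illustration (the BIO labeling scheme shown here is a common convention and is not prescribed by the patent), boundary determination and type labeling for the sentence "Kobe is one of the greatest players in NBA history" could yield:

    Kobe   is  one  of  the  greatest  players  in  NBA    history
    B-PER  O   O    O   O    O         O        O   B-ORG  O

where B-PER marks the beginning of a person entity, B-ORG the beginning of an organization entity, and O a token outside any entity.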
At present, for texts with well-formed sentence structure and sufficient context, satisfactory entity recognition results can be obtained with technologies such as BiLSTM+CRF or BERT+CRF. In some application fields, however, such as social media, the text to be processed is short, lacks context, and is colloquial, misspelled and abbreviated, so that conventional named entity recognition technology cannot achieve a good enough recognition effect.
Disclosure of Invention
In view of the foregoing, the present invention aims to provide a named entity recognition method, apparatus and device based on a matching picture, and accordingly a computer-readable storage medium and a computer program product, so as to solve the problem of poor named entity recognition in certain application environments.
The technical scheme adopted by the invention is as follows:
in a first aspect, the present invention provides a named entity recognition method based on a matching picture, wherein the method includes:
according to a preset strategy, acquiring deep information of a text to be processed and attribute information of a matching picture attached to the text to be processed, wherein the attribute information is represented in a text form;
extracting text information of the text to be processed and visual information of the matching picture;
and carrying out named entity recognition processing by combining the text information, the deep information, the attribute information and the visual information to obtain an entity type sequence of the text to be processed.
In at least one possible implementation manner, the obtaining deep information of the text to be processed according to the predetermined policy includes:
and acquiring entity knowledge information of the text to be processed according to a pre-constructed multi-modal knowledge graph, wherein the multi-modal knowledge graph comprises a plurality of entities and pictures associated with the entities.
In at least one possible implementation manner, the acquiring, according to a pre-constructed multi-modal knowledge graph, entity knowledge information of a text to be processed includes:
matching a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph;
screening out target entities from the candidate entities by using the matching picture and pictures associated with the candidate entities;
and acquiring a plurality of knowledge of the target entity from the multi-modal knowledge graph to be used as entity knowledge information of the text to be processed.
In at least one possible implementation manner, the matching, by using the multi-modal knowledge-graph, a plurality of candidate entities of the text to be processed includes:
pre-constructing an alias table corresponding to entities in the multi-modal knowledge graph;
matching the text to be processed against the entity names and the alias table in the multi-modal knowledge graph;
and constructing the entities in the multi-modal knowledge graph that meet a preset matching standard, together with their one-hop or multi-hop entities, into a candidate entity set.
In at least one possible implementation manner, the obtaining of attribute information of the matching picture attached to the text to be processed includes: obtaining type information of the matching picture, expressed in text form, based on an image classification strategy.
In at least one possible implementation manner, the performing named entity recognition processing by combining the text information, the deep information, the attribute information, and the visual information includes:
obtaining a text context representation of a character unit according to the text information of the character unit in the text to be processed, the deep information and the attribute information of the matching picture;
obtaining the visual context representation of the character unit according to the text context representation and the visual information;
fusing the text information of the character unit, the text context representation and the visual context representation to obtain a comprehensive representation of the character unit;
and according to the comprehensive representation, carrying out entity type marking on the character unit.
In at least one possible implementation manner, the performing named entity recognition processing by combining the text information, the deep information, the attribute information, and the visual information specifically includes:
combining the text to be processed, the deep information and the attribute information, performing attention calculation on the character units to obtain a first association degree between a target character unit and the other character units;
performing attention calculation again by using the visual information and the first association degree to obtain a second association degree that incorporates the image information;
dynamically combining the first association degree and the second association degree to obtain a multi-modal context representation of the target character unit;
fusing the multi-modal context with the text information of the target character unit to obtain a comprehensive representation of the target character unit;
and identifying the entity type of the target character unit by using the comprehensive representation.
In a second aspect, the present invention provides a named entity recognition apparatus based on a matching picture, including:
the text-level auxiliary information acquisition module is used for acquiring deep information of a text to be processed and attribute information of a matching picture attached to the text to be processed according to a preset strategy, wherein the attribute information is represented in text form;
the basic information extraction module is used for extracting text information of the text to be processed and visual information of the matching picture;
and the named entity identification module is used for carrying out named entity identification processing by combining the text information, the deep information, the attribute information and the visual information to obtain an entity type sequence of the text to be processed.
In at least one possible implementation manner, the text-level auxiliary information acquisition module includes:
an entity knowledge acquisition sub-module, used for acquiring entity knowledge information of the text to be processed according to a pre-constructed multi-modal knowledge graph, wherein the multi-modal knowledge graph comprises a plurality of entities and pictures associated with the entities.
In at least one possible implementation manner, the entity knowledge acquisition sub-module includes:
a candidate entity matching unit, used for matching a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph;
a target entity screening unit, used for screening a target entity from the candidate entities by utilizing the matching picture and the pictures associated with the candidate entities;
and an entity knowledge acquisition unit, used for acquiring a plurality of knowledge of the target entity from the multi-modal knowledge graph as entity knowledge information of the text to be processed.
In at least one possible implementation manner, the candidate entity matching unit includes:
an alias table construction component, for pre-constructing an alias table corresponding to entities in the multi-modal knowledge graph;
an entity matching component, used for matching the text to be processed against the entity names and the alias table in the multi-modal knowledge graph;
and a candidate entity construction component, used for constructing the entities in the multi-modal knowledge graph that meet a preset matching standard, together with their one-hop or multi-hop entities, into a candidate entity set.
In at least one possible implementation manner, the text-level auxiliary information acquisition module includes: a matching picture attribute acquisition sub-module, used for obtaining type information of the matching picture, expressed in text form, based on an image classification strategy.
In at least one possible implementation manner, the named entity identifying module includes:
the text representation calculation unit is used for obtaining a text context representation of a character unit according to the text information of the character unit in the text to be processed, the deep information and the attribute information of the matching picture;
the visual representation calculation unit is used for obtaining the visual context representation of the character unit according to the text context representation and the visual information;
the multi-modal fusion unit is used for fusing the text information, the text context representation and the visual context representation of the character unit to obtain the comprehensive representation of the character unit;
and the entity type labeling unit is used for carrying out entity type labeling on the character unit according to the comprehensive representation.
In at least one possible implementation manner, the text representation calculating unit is specifically configured to:
and performing attention calculation on the character unit by combining the text to be processed, the deep information and the attribute information to obtain a first association degree between the target character unit and other character units.
In at least one possible implementation manner, the visual representation calculation unit is specifically configured to perform attention calculation again by using the visual information and the first association degree to obtain a second association degree that incorporates the image information.
In at least one possible implementation manner, the multi-modal fusion unit is specifically configured to:
dynamically combining the first association degree and the second association degree to obtain a multi-modal context representation of the target character unit;
and fusing the multi-modal context and the text information of the target character unit to obtain the comprehensive representation of the target character unit.
In at least one possible implementation manner, the entity type labeling unit is specifically configured to:
and identifying the entity type of the target character unit by using the comprehensive representation.
In a third aspect, the present invention provides a named entity recognition device based on a matching picture, including:
one or more processors, a memory (which may employ a non-volatile storage medium), and one or more computer programs stored in the memory, the one or more computer programs comprising instructions which, when executed by the device, cause the device to perform the method in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform at least the method as described in the first aspect or any of its possible implementations.
In a fifth aspect, the present invention also provides a computer program product for performing at least the method of the first aspect or any of its possible implementations, when the computer program product is executed by a computer.
In at least one possible implementation manner of the fifth aspect, the relevant program related to the product may be stored in whole or in part on a memory packaged with the processor, or may be stored in part or in whole on a storage medium not packaged with the processor.
The conception of the invention is as follows. In certain scenarios the input text carries insufficient information and takes non-uniform forms, so that objects with indefinite meanings in the text are hard to recognize; the invention therefore introduces attribute information of the picture attached in such scenarios, together with deep information of the text, to assist named entity recognition. In particular, the invention converts image information to the text level, so that the image attributes, the deep text information and the basic text information are expressed in a unified form. When the basic text information, the deep information, the picture attribute information and the basic visual information are then processed jointly in a multi-modal manner, the shortage of input text information is compensated on the one hand, and the spatial heterogeneity between image and text is reduced on the other, so that text and image can interact fully and combine deeply. The efficiency and accuracy of named entity recognition in such scenarios can thereby be greatly improved.
Furthermore, in other embodiments the invention proposes obtaining the deep information of the text by combining a multi-modal knowledge graph: on the basis of text matching, a target entity can be screened out with the help of the matching picture, so that knowledge information related to the input text is extracted from the multi-modal knowledge graph according to the target entity and used to assist named entity recognition.
Drawings
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the accompanying drawings, in which:
FIG. 1 is a sample illustration of a social media platform;
FIG. 2 is a flowchart of an embodiment of a named entity recognition method based on a matching picture according to the present invention;
FIG. 3 is a schematic diagram of an embodiment of a multimodal knowledge-graph provided by the present invention;
FIG. 4 is a flowchart illustrating a method for acquiring entity knowledge information according to a preferred embodiment of the present invention;
FIG. 5 is a flowchart illustrating a multi-modal named entity recognition processing method according to an embodiment of the present invention;
FIG. 6 is a diagram of an embodiment of a named entity recognition apparatus based on a matching picture according to the present invention;
FIG. 7 is a schematic diagram of an embodiment of a named entity recognition device based on a matching picture according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
Before describing the embodiments of the present invention, the logical context and derivation of the inventive process are explained. To address the aforementioned poor recognition of named entities caused by short texts and insufficient information in specific fields, the inventors analyzed such fields. Research and observation of a large number of examples, taking a social media platform as an example, shows that most samples from this application environment relate to specific topics, such as, but not limited to, sports, movies, music, celebrities and travel. Although the amount of text is limited, samples to be processed in such fields are usually accompanied by matching pictures that are highly topic-related, such as movie posters, album covers, tourist cities, photos of people and landmark buildings.
Based on this analysis, the invention primarily considers combining the visual information of the matching picture to realize multi-modal named entity recognition. Consider a sample of this kind: if the visual information of the matching picture is ignored and the entity type is determined from the text "Rocky is ready for snow search" alone, "Rocky" is easily misrecognized as the type "People", i.e., labeled as a person; with the matching picture provided by the example, it can be known that "Rocky" most probably refers to the name of the dog in the picture. For named entity recognition tasks in such specific fields, the attached picture information is therefore introduced to assist text entity recognition. However, the inventors found through practical analysis that real samples are complex and diverse, and a simple combination of picture and text cannot meet and cover the recognition requirements of a large number of samples; even though such a scheme improves on recognition that relies on characters alone, the actual entity recognition results remain poor. The reason is that a simple picture-text combination cannot dig out the deeper picture information, and the picture information can hardly interact deeply enough with the text information. Specifically, the inventors consider that named entity recognition that simply combines visual information mainly suffers from the following two shortcomings: 1) the understanding of the image stays on the shallow layer of the content objects presented by the matching picture, and the deeper hidden information of the picture cannot be known; for example, if the matching picture of a text is a movie poster, shallow visual processing only finds some people or objects in it without knowing that the picture actually represents a movie poster, so some entities in the corresponding text cannot be recognized, or are recognized with deviation; 2) text information and image information express semantics in their respective expression spaces, and the large spatial heterogeneity between these expression spaces means that a simple fusion, without further analysis and processing, can hardly produce deep information interaction between them, which in turn degrades the final entity recognition effect.
In view of the above, the inventors consider that, for the named entity recognition problem in specific application fields, if matching picture information is to participate, the foregoing two technical obstacles need to be effectively overcome. Accordingly, the present invention provides at least one embodiment of a named entity recognition method based on a matching picture; as shown in FIG. 2, some embodiments may specifically include:
step S1, according to a preset strategy, obtaining deep information of the text to be processed and attribute information of the matching picture attached to the text to be processed.
The deep information and the attribute information of the matching picture are the specific solutions designed for the first technical obstacle: instead of simply adopting the two basic modalities of text and image, the deep information of the text and the attribute information of the image are mined to form multi-modal information, which addresses the failure or error in recognizing named entities caused by the limited amount of input text and its scattered, non-uniform form in specific scenarios. In addition, to overcome the second obstacle, in this embodiment the attribute information of the matching picture is represented in text form; in other words, part of the features of the matching picture are presented as text-level information, which minimizes the semantic gap between text and image. That is, the introduced text-form image attribute modality serves as a bridge between the image semantic expression space and the text semantic expression space, so that information interaction among the multiple modalities can be completed more fully and the entity recognition task for the specific scenario can be accomplished. It should be noted that the specific scenarios contemplated by the present invention include, but are not limited to, social media platforms; any scenario with little text information and a text-related matching picture may be applicable.
Regarding the acquisition of the deep information of the text, different embodiments may adopt different strategies to extract the required deep information, such as, but not limited to, syntactic and grammatical dependency relations, TF-IDF scores, topic information in natural language processing, word collocation characteristics, and multi-dimensional deep semantic features.
Preferably, on this basis, the present invention proposes pre-constructing a multi-modal knowledge graph that contains not only a number of entities but also pictures related to those entities. Taking FIG. 3 as an example, the multi-modal knowledge graph may include a plurality of knowledge triples related to the entity "Kobe Bryant", such as (Kobe Bryant, isA, basketball player) and (Kobe Bryant, friend, Lebron James); in addition, it includes several pictures related to the entities "Kobe Bryant", "Lebron James" and "Los Angeles", where the pictures image1, image2, image3 ... related to each entity are indicated by dotted lines and dotted boxes beside Kobe Bryant, Lebron James and Los Angeles respectively. Those skilled in the art will appreciate, first, that the multi-modal knowledge graph of FIG. 3 is merely an example and may include knowledge triples and related pictures corresponding to other entities; second, that when the multi-modal knowledge graph is constructed, related pictures need not be set for certain relation triples. For example, the entity "basketball player" in the triple (Kobe Bryant, isA, basketball player) of FIG. 3 may preferably not be associated with pictures: isA expresses a conceptual relationship, and the corresponding entity is a higher-level conceptual entity with no directly related picture. Put differently, constructing related pictures for a concept-type entity such as "basketball player" would not only consume resources but would contribute little to the technical task of the present invention. It is therefore unnecessary to associate pictures with every entity in the knowledge graph; rather, several relevant pictures are set, as needed, for particular entities, such as the entities in the aforementioned non-conceptual relation triples (relationships such as residence and friend).
In addition, two points should be noted. First, unlike the traditional line of thought, the knowledge graph (or multi-modal knowledge graph) adopted here is not used for conventional reasoning and decoding; rather, more background knowledge is injected into the small amount of text content in the specific scenario. That is, the knowledge graph is used in reverse, to help the information-poor text effectively recognize the named entities it contains, especially text entities for which traditional entity recognition can hardly obtain accurate results. Second, the role of the knowledge graph in the present invention is to provide deep information (entity knowledge) of the text; the entities it contains may or may not coincide with the named entities finally recognized by the invention. In other words, the entities in the knowledge graph are not directly related to the final entities the invention needs to recognize; their role is only to provide supplementary knowledge.
Specifically, based on the foregoing concept, in some embodiments the invention further extracts the required graph knowledge by combining character matching with image comparison. Taking FIG. 4 as an example, the manner of acquiring entity knowledge information may include the following steps:
step S11, matching a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph;
step S12, screening out target entities from the candidate entities by using the matching picture and pictures associated with the candidate entities;
and step S13, acquiring a plurality of knowledge of the target entity from the multi-modal knowledge graph as entity knowledge information of the text to be processed.
In practical operation, entity matching of the input text against the multi-modal knowledge graph can be performed with, but not limited to, a forward maximum matching algorithm. Preferably, an alias table corresponding to part or all of the entities in the multi-modal knowledge graph can be pre-constructed to ensure that nothing is missed during character matching. Further, in other preferred embodiments of the present invention, not only the entities matched by characters from the graph but also their one-hop or multi-hop entities may be taken as candidates, jointly constructing a more comprehensive candidate entity set for the text to be processed.
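A minimal sketch of this candidate-matching step in Python, assuming a simple in-memory alias table and a one-hop neighbor map (the table contents and helper names are illustrative assumptions, not data from the patent):

    # Forward maximum matching of the text against entity names/aliases,
    # then expansion with one-hop neighbors from the knowledge graph.
    ALIAS_TABLE = {                      # alias -> canonical entity (illustrative)
        "kobe": "Kobe Bryant",
        "kobe bryant": "Kobe Bryant",
        "nba": "NBA",
    }
    ONE_HOP = {                          # entity -> one-hop entities (illustrative)
        "Kobe Bryant": ["Lebron James", "Los Angeles"],
        "NBA": [],
    }

    def match_candidates(text, max_len=4):
        tokens = text.lower().split()
        candidates, i = set(), 0
        while i < len(tokens):
            # try the longest span first (forward maximum matching)
            for j in range(min(len(tokens), i + max_len), i, -1):
                span = " ".join(tokens[i:j])
                if span in ALIAS_TABLE:
                    entity = ALIAS_TABLE[span]
                    candidates.add(entity)
                    candidates.update(ONE_HOP.get(entity, []))
                    i = j - 1            # skip past the matched span
                    break
            i += 1
        return candidates

    print(match_candidates("Kobe is one of the greatest players in NBA history"))
    # e.g. {'Kobe Bryant', 'Lebron James', 'Los Angeles', 'NBA'} (set order varies)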
Then, because the scenarios of interest are accompanied by a matching picture, the similarity between the matching picture and the pictures associated with each candidate entity in the candidate entity set (including its one-hop or multi-hop entities) can be calculated, and according to an established threshold criterion, the entity whose associated picture has the highest similarity can be selected from the comprehensive candidate set as the target entity. Continuing the previous example: if the matching picture contains the number "24", and the similarity comparison shows that a picture associated with the candidate entity "Kobe Bryant" (or one of its one-hop or multi-hop entities) is the most similar, then "Kobe Bryant" is determined to be the target entity.
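The screening itself reduces to a nearest-picture search; a sketch, assuming the matching picture and every candidate's associated pictures have already been encoded as feature vectors by some image encoder (the encoder and the threshold value are assumptions):

    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

    def select_target_entity(match_vec, candidate_pictures, threshold=0.5):
        """candidate_pictures: dict mapping entity -> list of picture feature vectors."""
        best_entity, best_score = None, threshold
        for entity, vectors in candidate_pictures.items():
            for v in vectors:
                score = cosine(match_vec, v)
                if score > best_score:      # keep the entity with the most similar picture
                    best_entity, best_score = entity, score
        return best_entity                  # None if no picture passes the threshold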
After that, several pieces of knowledge about the target entity can be extracted from the multi-modal knowledge graph as the deep information of the text to be processed. It should be noted that, because an entity in the knowledge graph may participate in many relation triples, i.e., carry many pieces of relational knowledge, the amount of knowledge taken from the target entity is in theory unlimited, and all of it could be spliced with the text to be processed for subsequent recognition. However, incorporating more knowledge both increases the amount of computation and may introduce noise that makes the final processing effect less than ideal. Therefore, in some preferred embodiments of the present invention, only the part of the knowledge most relevant to named entities is extracted, for example, but not limited to, the entity concept knowledge provided by isA relation triples; continuing the previous example, (Kobe Bryant, isA, basketball player) can be extracted as deep information of the text to be processed.
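Selecting only the high-relevance knowledge can be as simple as a relation whitelist; a sketch (the whitelist and the serialization format are illustrative choices, not mandated by the patent):

    # Keep only triples about the target entity whose relation is judged useful
    # for entity typing (e.g. conceptual isA knowledge), serialized for embedding.
    KEPT_RELATIONS = {"isA"}

    def entity_knowledge(triples, target):
        kept = [(h, r, t) for h, r, t in triples if h == target and r in KEPT_RELATIONS]
        return " ; ".join(f"{h} {r} {t}" for h, r, t in kept)

    triples = [("Kobe Bryant", "isA", "basketball player"),
               ("Kobe Bryant", "friend", "Lebron James")]
    print(entity_knowledge(triples, "Kobe Bryant"))   # Kobe Bryant isA basketball player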
As for the case mentioned above, in which the name of an entity in the knowledge graph (or its alias) may or may not coincide with an entity in the input text: taking the input text "Kobe is one of the greatest players in NBA history" as an example, the target entity obtained after the foregoing processing may be "Kobe Bryant" or "Kobe"; in either case, the relational knowledge corresponding to the target entity (for example, the isA triple) is embedded into the text to be processed as entity knowledge information and serves as input, containing deep information, for the subsequent recognition stage. The aim of the invention is to add determinable deep information such as entity knowledge to the input text, not to copy the type of the graph entity "Kobe" (entity types in the graph are known) directly onto the "Kobe" in the text to be processed. This way of extracting knowledge information therefore not only assists the recognition of "Kobe" in "Kobe is one of the greatest players in NBA history" but also provides positive feedback for recognizing other named entities in the text (such as "NBA"). Of course, if the target entity selected from the graph exactly matches the characters of an entity in the text to be processed, the entity type output by the subsequent named entity recognition may be checked against the type of the target entity extracted here; the invention is not limited in this respect.
Next, regarding the acquisition in step S1 of the attribute information of the matching picture: in practical operation, semantic information, content, scene and the like of the matching picture, described in text form, may be obtained from it, using techniques such as feature extraction, natural-language conversion and image content analysis. For example, but not limited to, an image classification model such as Inception v3 or ResNet may be pre-trained on ImageNet; such a model yields the probabilities that the matching picture belongs to a large number of predetermined categories, and in this embodiment the categories with the top n probability values can be taken as the attribute information of the matching picture. For instance, the top-5 type attributes of a certain matching picture might be obtained as cliff, alp, jean, megalith and ski, and these five attributes can be expressed in a form consistent with the text information, as auxiliary embedded information for subsequent processing. That is, text-form attribute values of the matching picture can be obtained with an existing image classification network and, in the subsequent recognition stage, embedded in the same way as the input text, so that interaction happens in the same expression space and the semantic-gap problem between image and text is effectively alleviated.
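A sketch of this attribute extraction with an off-the-shelf ImageNet classifier (torchvision's pretrained ResNet-50 stands in here for the Inception v3 or ResNet model the paragraph mentions; the file name is a placeholder):

    import torch
    from PIL import Image
    from torchvision import models

    weights = models.ResNet50_Weights.IMAGENET1K_V2
    model = models.resnet50(weights=weights).eval()
    preprocess = weights.transforms()

    def picture_attributes(path, top_n=5):
        image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            probs = model(image).softmax(dim=1)[0]
        top = probs.topk(top_n)
        # map class indices to ImageNet category names, e.g. "cliff", "alp", "ski"
        return [weights.meta["categories"][int(i)] for i in top.indices]

    # attrs = picture_attributes("matching_picture.jpg")  # then embed like ordinary text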
Returning to FIG. 2: step S2, extracting the text information of the text to be processed and the visual information of the matching picture.
The text information and the visual information can be understood as the basic features of the input text and the matching picture: for example, but not limited to, dividing the text into character units and obtaining word-level and/or character-level vector expressions together with basic features such as part of speech and dependency relations, and extracting from the matching picture basic image features such as resolution, color, bit depth, saturation, brightness, texture and semantics. The specific contents and means of extraction can follow existing mature technology and are not described further here. It should be noted, however, that although image attribute information represented as text is extracted as described above, those attributes cannot fully replace the image information itself; the visual information of the picture retains high value for assisting named entity recognition, so this step proposes that the image modality of the matching picture still be retained.
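For the visual side, a sketch of extracting grid-shaped region features with VGG, which the worked example later in this description also assumes (the choice of layer is illustrative):

    import torch
    from torchvision import models

    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

    def visual_features(image_tensor):
        """image_tensor: (1, 3, 224, 224), already normalized for VGG.
        Returns a (49, 512) grid of region features from the last conv block."""
        with torch.no_grad():
            fmap = vgg.features(image_tensor)    # (1, 512, 7, 7)
        return fmap.flatten(2).squeeze(0).T      # 7*7 = 49 regions, 512-dim each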
Step S3, conducting named entity recognition processing by combining the text information, the deep information, the attribute information and the visual information to obtain an entity type sequence of the text to be processed.
The aim of this step is to combine multiple modalities so as to enrich the possible participation, in the recognition of a named entity, of every object that influences it. There are many ways to combine the text information, the deep information, the attribute information and the visual information. For example, the basic text information and the visual information may be processed together while the deep text information and the picture attribute information are fused, and the two partial results then combined; or the text information directly related to the text to be processed may be fused with the deep information, the attribute information directly related to the matching picture fused with the visual information, and the two fusion results then combined; and so on. On this basis, in some preferred embodiments the invention first processes the three kinds of text-form information cooperatively and then fuses them with the visual information; specifically, the multi-modal named entity recognition processing shown in FIG. 5 may include the following steps:
step S31, obtaining the text context representation of a character unit according to the text information of the character unit in the text to be processed, the deep information, and the attribute information of the matching picture.
In this embodiment a single character unit is the processing object; a character unit here may be a word, a character, a symbol and the like. In actual operation, this step may use an existing named entity recognition mode with a model architecture such as BiLSTM+CRF or BERT+CRF, take the text to be processed, the aforementioned deep information and the attribute information as inputs, and perform attention calculation on the character units to obtain the first association degree between the target character unit (the single object whose entity type is to be determined) and the other character units.
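A sketch of this step as scaled dot-product self-attention over the concatenated embeddings of the text, the entity knowledge and the picture attributes (dimensions and concatenation order are assumptions; the patent does not fix a particular attention formula):

    import torch
    import torch.nn.functional as F

    def first_association(text_emb, knowledge_emb, attribute_emb):
        """Each input: (len_i, d). Returns an (L, L) matrix whose rows give the
        association degree between each character unit and all other units."""
        h = torch.cat([text_emb, knowledge_emb, attribute_emb], dim=0)  # (L, d)
        scores = h @ h.T / h.size(-1) ** 0.5     # scaled alignment scores
        return F.softmax(scores, dim=-1)         # first association degrees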
And step S32, obtaining the visual context representation of the character unit according to the text context representation and the visual information.
In actual operation, attention calculation can be performed again using the visual information of the matching picture and the first association degree obtained in the previous step, so as to obtain a second association degree that incorporates the image information, i.e., a context representation carrying visual information.
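Continuing the sketch above, step S32 can be written as cross-modal attention in which the text-side context queries the visual region features (the projection layer is an assumption needed to reconcile dimensions):

    import torch
    import torch.nn.functional as F

    def second_association(first_assoc, text_emb, visual_feats, proj):
        """first_assoc: (L, L) from the previous sketch; text_emb: (L, d);
        visual_feats: (R, dv), e.g. the VGG grid features from step S2;
        proj: a torch.nn.Linear(dv, d) mapping visual features into text space."""
        context = first_assoc @ text_emb                   # text context per unit, (L, d)
        v = proj(visual_feats)                             # (R, d)
        scores = context @ v.T / context.size(-1) ** 0.5   # (L, R) cross-modal scores
        attn = F.softmax(scores, dim=-1)                   # second association degrees
        return attn @ v                                    # visual context per unit, (L, d)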
And step S33, fusing the text information of the character unit, the text context representation and the visual context representation to obtain the comprehensive representation of the character unit.
In practical operation, the obtained first and second association degrees can be dynamically combined to obtain a multi-modal context representation of the target character unit, and the multi-modal context can then be fused with the text information of the target character unit to obtain its comprehensive representation.
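The dynamic combination can be sketched as a gated fusion; the sigmoid gate below is one common realization, not a form mandated by the patent:

    import torch

    class GatedFusion(torch.nn.Module):
        """Dynamically weighs the text context against the visual context."""
        def __init__(self, d):
            super().__init__()
            self.gate = torch.nn.Linear(2 * d, d)

        def forward(self, text_ctx, visual_ctx, text_emb):
            g = torch.sigmoid(self.gate(torch.cat([text_ctx, visual_ctx], dim=-1)))
            multimodal_ctx = g * text_ctx + (1 - g) * visual_ctx
            # fuse with the character unit's own text information
            return torch.cat([multimodal_ctx, text_emb], dim=-1)   # comprehensive repr.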
And step S34, according to the comprehensive representation, carrying out entity type marking on the character unit.
Finally, the entity type of the target character unit can be identified by using the comprehensive representation.
Since the above process is not the focus of the present invention, and the applied model framework or related algorithm can refer to the existing entity recognition technology, the foregoing process is briefly described by a specific example:
the core concept of the recognition processing is that for each character unit in the text to be processed, text information, deep information, image attributes and visual information can be integrated to form a multi-modal context information representation. Specifically, the vector embedded with the entity knowledge information and the image attribute vector may be subjected to self-attention operation together with the word-char vector corresponding to the character unit in the text to be processed. Then, for the target character unit, an alignment score of the target character unit and the context is obtained, and the alignment score can be regarded as a degree of association (first degree of association) between the target character unit and the context of the text to be processed; then, based on the image visual features extracted by using tools such as VGG and the like, calculating the cross-modal visual attention by combining the obtained alignment scores between the text character units to obtain a corresponding visual context; then, but not limited to, dynamically combining the alignment scores between the character units and the visual context by using a gating fusion mechanism, obtaining the probability distribution of a dynamic fusion result through a softmax layer in a model architecture, and combining the probability distribution with the embedded representation of each character unit to perform weighted summation operation to obtain a multi-modal context finally blended into text, images, image attributes and deep knowledge; then, word-char embedding of the target character unit can be preferably fused to obtain final comprehensive vector representation containing multi-modal information; finally, the comprehensive vector can be input to a CRF layer through two fully-connected layers in the model architecture to complete the labeling task of the entity type sequence.
In conclusion, the conception of the invention is as follows. In certain scenarios the input text carries insufficient information and takes non-uniform forms, so that objects with indefinite meanings in the text are hard to recognize; the invention therefore introduces attribute information of the picture attached in such scenarios, together with deep information of the text, to assist named entity recognition. In particular, the invention converts image information to the text level, so that the image attributes, the deep text information and the basic text information are expressed in a unified form. When the basic text information, the deep information, the picture attribute information and the basic visual information are then processed jointly in a multi-modal manner, the shortage of input text information is compensated on the one hand, and the spatial heterogeneity between image and text is reduced on the other, so that text and image can interact fully and combine deeply. The efficiency and accuracy of named entity recognition in such scenarios can thereby be greatly improved.
Corresponding to the above embodiments and preferred solutions, the present invention further provides an embodiment of a named entity recognition apparatus based on a matching picture; as shown in FIG. 6, it may specifically include the following components:
the text-level auxiliary information acquisition module 1, used for acquiring deep information of a text to be processed and attribute information of a matching picture attached to the text to be processed according to a preset strategy, wherein the attribute information is represented in text form;
the basic information extraction module 2 is used for extracting text information of the text to be processed and visual information of the matching picture;
and the named entity identification module 3 is used for carrying out named entity identification processing by combining the text information, the deep information, the attribute information and the visual information to obtain an entity type sequence of the text to be processed.
In at least one possible implementation manner, the text-level auxiliary information acquisition module includes:
an entity knowledge acquisition sub-module, used for acquiring entity knowledge information of the text to be processed according to a pre-constructed multi-modal knowledge graph, wherein the multi-modal knowledge graph comprises a plurality of entities and pictures associated with the entities.
In at least one possible implementation manner, the entity knowledge acquisition sub-module includes:
a candidate entity matching unit, used for matching a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph;
a target entity screening unit, used for screening a target entity from the candidate entities by utilizing the matching picture and the pictures associated with the candidate entities;
and an entity knowledge acquisition unit, used for acquiring a plurality of knowledge of the target entity from the multi-modal knowledge graph as entity knowledge information of the text to be processed.
In at least one possible implementation manner, the candidate entity matching unit includes:
an alias table construction component, for pre-constructing an alias table corresponding to entities in the multi-modal knowledge graph;
an entity matching component, used for matching the text to be processed against the entity names and the alias table in the multi-modal knowledge graph;
and a candidate entity construction component, used for constructing the entities in the multi-modal knowledge graph that meet a preset matching standard, together with their one-hop or multi-hop entities, into a candidate entity set.
In at least one possible implementation manner, the text-level auxiliary information acquisition module includes: a matching picture attribute acquisition sub-module, used for obtaining type information of the matching picture, expressed in text form, based on an image classification strategy.
In at least one possible implementation manner, the named entity identifying module includes:
the text representation calculation unit, used for obtaining the text context representation of a character unit according to the text information of the character unit in the text to be processed, the deep information and the attribute information of the matching picture;
the visual representation calculation unit is used for obtaining the visual context representation of the character unit according to the text context representation and the visual information;
the multi-modal fusion unit, used for fusing the text information, the text context representation and the visual context representation of the character unit to obtain the comprehensive representation of the character unit;
and the entity type labeling unit is used for carrying out entity type labeling on the character unit according to the comprehensive representation.
In at least one possible implementation manner, the text representation calculating unit is specifically configured to:
and performing attention calculation on the character unit by combining the text to be processed, the deep information and the attribute information to obtain a first association degree between the target character unit and other character units.
In at least one possible implementation manner, the visual representation calculation unit is specifically configured to perform attention calculation again by using the visual information and the first association degree to obtain a second association degree that incorporates the image information.
In at least one possible implementation manner, the multi-modal fusion unit is specifically configured to:
dynamically combining the first association degree and the second association degree to obtain a multi-modal context representation of the target character unit;
and fusing the multi-modal context and the text information of the target character unit to obtain the comprehensive representation of the target character unit.
In at least one possible implementation manner, the entity type labeling unit is specifically configured to:
and identifying the entity type of the target character unit by using the comprehensive representation.
It should be understood that the division of components in the matching-picture-based named entity recognition apparatus shown in FIG. 6 is merely a logical division; an actual implementation may integrate them wholly or partially into one physical entity, or keep them physically separate. All of these components may be implemented as software invoked by a processing element, or entirely in hardware, or partly as software invoked by a processing element and partly as hardware. For example, a certain module may be a separately arranged processing element or may be integrated into a chip of the electronic device, and the other components are implemented similarly. In addition, all or part of the components can be integrated together or implemented independently. In implementation, each step of the above method, or each of the above components, can be accomplished by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above components may be one or more integrated circuits configured to implement the above methods, such as one or more Application-Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field-Programmable Gate Arrays (FPGAs). For another example, these components may be integrated together and implemented in the form of a System-on-a-Chip (SoC).
In view of the foregoing examples and their preferred embodiments, those skilled in the art will appreciate that, in practice, the technical idea underlying the present invention may be applied in a variety of embodiments; the present invention is schematically illustrated by the following carriers:
(1) A named entity recognition device based on a matching picture. The device may specifically include: one or more processors, a memory, and one or more computer programs, wherein the one or more computer programs are stored in the memory and comprise instructions which, when executed by the device, cause the device to perform the steps/functions of the foregoing embodiments or an equivalent implementation.
FIG. 7 is a schematic structural diagram of an embodiment of a named entity recognition device based on a matching picture provided by the present invention; the device may be a cloud server, a computer of a related platform, an intelligent terminal, or the like.
As shown in FIG. 7, the matching-picture-based named entity recognition device 900 includes a processor 910 and a memory 930. The processor 910 and the memory 930 can communicate with each other, transmitting control and/or data signals through an internal connection path; the memory 930 is used for storing a computer program, and the processor 910 is used for calling and running the computer program from the memory 930. The processor 910 and the memory 930 may be combined into a single processing device or, more commonly, be components separate from each other, with the processor 910 executing the program code stored in the memory 930 to implement the functions described above. In a specific implementation, the memory 930 may be integrated in the processor 910 or be separate from it.
In addition, to further improve the functionality of the matching-picture-based named entity recognition device 900, the device 900 may further comprise one or more of an input unit 960, a display unit 970, an audio circuit 980, a camera 990, a sensor 901 and the like, and the audio circuit may further comprise a speaker 982, a microphone 984 and the like. The display unit 970 may include a display screen.
Further, the apparatus 900 may also include a power supply 950 for providing power to various devices or circuits within the apparatus 900.
It should be understood that the operation and/or function of the various components of the apparatus 900 can be referred to in the foregoing description with respect to the method, system, etc., and the detailed description is omitted here as appropriate to avoid repetition.
It should be understood that the processor 910 in the matching-picture-based named entity recognition device 900 shown in FIG. 7 may be a system-on-a-chip (SoC), and the processor 910 may include a Central Processing Unit (CPU) as well as other types of processors, for example a Graphics Processing Unit (GPU), as described further below.
In summary, various portions of the processors or processing units within the processor 910 may cooperate to implement the foregoing method flows, and corresponding software programs for the various portions of the processors or processing units may be stored in the memory 930.
(2) A readable storage medium on which a computer program (or the above-described apparatus) is stored; when executed, the program causes the computer to perform the steps/functions of the foregoing embodiments or equivalent implementations.
In the several embodiments provided by the present invention, any function, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, as described below.
(3) A computer program product (which may include the above-described apparatus) that, when run on a terminal device, causes the terminal device to perform the matching-picture-based named entity recognition method of the preceding embodiment or an equivalent implementation.
From the above description of the embodiments, it is clear to those skilled in the art that all or part of the steps of the above implementation methods can be accomplished by software plus a necessary general hardware platform. With this understanding, the above computer program product may include, but is not limited to, an APP; the aforementioned device/terminal may be a computer device, and the hardware structure of the computer device may further specifically include at least one processor, at least one communication interface, at least one memory and at least one communication bus, where the processor, the communication interface and the memory complete mutual communication through the communication bus. The processor may be a Central Processing Unit (CPU), a DSP, a microcontroller or a digital signal processor, and may further include a GPU, an embedded Neural-network Processing Unit (NPU) and an Image Signal Processor (ISP); it may further include an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. In addition, the processor may have the function of operating one or more software programs, and the software programs may be stored in a storage medium such as the memory. The aforementioned memory/storage medium may comprise non-volatile memories such as a non-removable magnetic disk, a USB flash drive, a removable hard disk or an optical disk, as well as Read-Only Memory (ROM), Random Access Memory (RAM) and the like.
In the embodiments of the present invention, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single or plural items. For example, "at least one of a, b, and c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be single or multiple.
Those skilled in the art will appreciate that the various modules, units, and method steps described in the embodiments disclosed in this specification can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
In addition, the embodiments in this specification are described in a progressive manner, and the same or similar parts of the embodiments may be referenced against each other. In particular, for the embodiments of devices, apparatuses, and the like, since they are substantially similar to the method embodiments, the description of the method embodiments may be consulted for the relevant points. The above-described embodiments of devices, apparatuses, and the like are merely illustrative; modules and units described as separate components may or may not be physically separate, and may be located in one place or distributed across multiple places, for example on the nodes of a system network. Some or all of the modules and units can be selected according to actual needs to achieve the purpose of the embodiments, and this can be understood and carried out by those skilled in the art without inventive effort.
The structure, features, and effects of the present invention have been described in detail above with reference to the embodiments shown in the drawings. The above embodiments, however, are merely preferred embodiments of the present invention; the technical features of these embodiments and their preferred modes can be reasonably combined and configured into various equivalent schemes by those skilled in the art without departing from the design idea and technical effects of the present invention. Therefore, the invention is not limited to the embodiments shown in the drawings, and all modifications and equivalent embodiments made according to the idea of the invention fall within the scope of the invention, provided they do not go beyond the spirit of the description and the drawings.

Claims (9)

1. A named entity recognition method based on a matching graph, characterized by comprising the following steps:
according to a preset strategy, acquiring deep information of a text to be processed and attribute information of a matching picture attached to the text to be processed, wherein the attribute information is represented in text form; the acquiring of the deep information of the text to be processed comprises: acquiring entity knowledge information of the text to be processed according to a pre-constructed multi-modal knowledge graph, wherein the multi-modal knowledge graph comprises a plurality of entities and pictures associated with the entities;
extracting text information of the text to be processed and visual information of the matching picture;
and performing named entity recognition processing by combining the text information, the deep information, the attribute information, and the visual information to obtain an entity type sequence of the text to be processed.
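For illustration only (not part of the claims), the following minimal Python sketch shows how the four signals of claim 1 — basic text information, deep information, picture attribute information, and visual information — might be wired together end to end. All names and data (TOY_KG, the stub classifier, the trivial tagging rule) are hypothetical stand-ins, not the patented model.

```python
# Toy multi-modal knowledge graph: entity -> facts and associated pictures.
TOY_KG = {
    "apple": {"facts": ["fruit", "company"], "images": ["img_logo_07"]},
}

def get_deep_information(text):
    """Entity knowledge looked up from the multi-modal knowledge graph."""
    return {w: TOY_KG[w]["facts"] for w in text.lower().split() if w in TOY_KG}

def get_attribute_information(image_id):
    """Matching-picture attribute information, represented in text form."""
    return "logo" if "logo" in image_id else "photo"   # stub classifier

def recognize(text, image_id):
    deep = get_deep_information(text)            # deep information
    attr = get_attribute_information(image_id)   # attribute information (text)
    tokens = text.split()                        # basic text information
    visual = [0.0, 0.0, 0.0]                     # placeholder visual features
    # attr and visual would feed the real multi-modal model; the rule
    # below is only a stand-in so the sketch runs end to end.
    return [(t, "ENTITY" if t.lower() in deep else "O") for t in tokens]

print(recognize("Apple releases a phone", "img_logo_07"))
# -> [('Apple', 'ENTITY'), ('releases', 'O'), ('a', 'O'), ('phone', 'O')]
```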
2. The named entity recognition method of claim 1, wherein the acquiring of the entity knowledge information of the text to be processed according to the pre-constructed multi-modal knowledge graph comprises:
matching a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph;
screening out target entities from the candidate entities by using the matching picture and pictures associated with the candidate entities;
and acquiring a plurality of knowledge of the target entity from the multi-modal knowledge graph to be used as entity knowledge information of the text to be processed.
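The screening step of claim 2 can be pictured as an image-similarity vote: the matching picture is compared against the pictures that the knowledge graph associates with each candidate. The sketch below assumes cosine similarity over precomputed picture feature vectors; the similarity measure and the toy feature values are illustrative assumptions.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def screen_target_entity(matching_picture_vec, candidates):
    """candidates: {entity name: [feature vectors of its KG pictures]}."""
    def best_score(vecs):
        return max(cosine(matching_picture_vec, v) for v in vecs)
    return max(candidates, key=lambda name: best_score(candidates[name]))

candidates = {
    "Apple Inc.":    [[0.9, 0.1, 0.0]],   # logo-like picture features
    "apple (fruit)": [[0.1, 0.8, 0.3]],   # fruit-like picture features
}
print(screen_target_entity([0.85, 0.2, 0.05], candidates))  # -> Apple Inc.
```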
3. The named entity recognition method of claim 2, wherein the matching of a plurality of candidate entities of the text to be processed by utilizing the multi-modal knowledge graph comprises:
pre-constructing a nickname table corresponding to entities in the multi-modal knowledge graph;
matching the text to be processed against the entity names in the multi-modal knowledge graph and against the nickname table;
and constructing, as a candidate entity set, the entities in the multi-modal knowledge graph that meet a preset matching standard, together with their one-hop or multi-hop entities.
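A toy rendering of claim 3's matching step follows: candidate entities are collected by matching the text against entity names and a nickname table, then expanded to their one-hop neighbors. The nickname table, adjacency data, and substring-based matching standard are all invented for illustration.

```python
NICKNAMES = {"big apple": "New York City"}          # nickname -> entity
NEIGHBORS = {"New York City": ["United States"]}    # one-hop entities

def candidate_entities(text, entity_names):
    text_l = text.lower()
    # Match against entity names and the pre-built nickname table.
    found = {e for e in entity_names if e.lower() in text_l}
    found |= {ent for nick, ent in NICKNAMES.items() if nick in text_l}
    # Expand each match by one hop (iterating would give multi-hop).
    for e in list(found):
        found.update(NEIGHBORS.get(e, []))
    return found

print(sorted(candidate_entities("I moved to the Big Apple",
                                ["New York City", "Paris"])))
# -> ['New York City', 'United States']
```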
4. The named entity recognition method of claim 1, wherein the obtaining of the attribute information of the matching picture attached to the text to be processed comprises: obtaining type information of the matching picture, expressed in text form, based on an image classification strategy.
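One plausible instantiation of such an image classification strategy is an off-the-shelf classifier whose predicted class name serves directly as the textual type information. The sketch below uses torchvision's pretrained ResNet-18 (assuming torchvision >= 0.13) purely as a stand-in; the claim does not prescribe any particular model.

```python
import torch
from PIL import Image
from torchvision import models, transforms

def matching_picture_type(path: str) -> str:
    """Return the matching picture's type as text, via a stock classifier."""
    weights = models.ResNet18_Weights.IMAGENET1K_V1
    model = models.resnet18(weights=weights).eval()
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        idx = model(x).argmax(dim=1).item()
    # The predicted class name is the type information in text form.
    return weights.meta["categories"][idx]
```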
5. The named entity recognition method of any one of claims 1 to 4, wherein the named entity recognition processing in combination with the text information, the deep information, the attribute information, and the visual information comprises:
obtaining a text context representation of a character unit according to the text information of the character unit in the text to be processed, the deep information, and the attribute information of the matching picture;
obtaining a visual context representation of the character unit according to the text context representation and the visual information;
fusing the text information of the character unit, the text context representation, and the visual context representation to obtain a comprehensive representation of the character unit;
and performing entity type labeling on the character unit according to the comprehensive representation.
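The per-character-unit flow of claim 5 can be sketched as three vector operations: build a text context from the character, deep, and attribute signals; refine it with visual features; and fuse everything into a comprehensive representation. The toy averaging and concatenation below stand in for the learned projections a real model would use.

```python
def text_context(char_vec, deep_vecs, attr_vec):
    # Pool the character, attribute, and deep-information vectors.
    pool = [char_vec, attr_vec] + deep_vecs
    return [sum(v[i] for v in pool) / len(pool) for i in range(len(char_vec))]

def visual_context(text_ctx, visual_vecs):
    # Refine the text context with visual features.
    pool = [text_ctx] + visual_vecs
    return [sum(v[i] for v in pool) / len(pool) for i in range(len(text_ctx))]

def fuse(char_vec, text_ctx, vis_ctx):
    # Concatenation as a stand-in for the fusion step.
    return char_vec + text_ctx + vis_ctx

c = [0.2, 0.4]                                        # character unit features
t_ctx = text_context(c, deep_vecs=[[0.1, 0.0]], attr_vec=[0.3, 0.3])
v_ctx = visual_context(t_ctx, visual_vecs=[[0.5, 0.5]])
comprehensive = fuse(c, t_ctx, v_ctx)                 # input to the tagger
print(comprehensive)
```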
6. The named entity recognition method of claim 5, wherein the performing of named entity recognition processing in combination with the text information, the deep information, the attribute information, and the visual information specifically comprises:
performing attention calculation on the character units by combining the text to be processed, the deep information, and the attribute information, to obtain a first degree of association between a target character unit and the other character units;
performing attention calculation again by using the visual information and the first degree of association, to obtain a second degree of association into which the image information is blended;
dynamically combining the first degree of association and the second degree of association to obtain a multi-modal context representation of the target character unit;
fusing the multi-modal context representation with the text information of the target character unit to obtain a comprehensive representation of the target character unit;
and identifying the entity type of the target character unit by using the comprehensive representation.
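Claim 6's two-pass attention can be illustrated numerically: a first scaled dot-product attention over the character units yields the first degree of association, a second pass folds in the visual information, and a gate dynamically combines the two. The scaled dot-product form, the gate value, and all vectors below are assumptions for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys):
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)  # one association weight per character unit

# First pass: the target character unit attends over the other character
# units, whose keys already mix text, deep, and attribute information.
query = [0.2, 0.4]
text_keys = [[0.1, 0.3], [0.5, 0.1], [0.2, 0.2]]
first = attention(query, text_keys)            # first degree of association

# Second pass: re-attend using the visual information together with the
# context summarized by the first-pass weights.
ctx1 = [sum(w * k[i] for w, k in zip(first, text_keys)) for i in range(2)]
visual_keys = [[0.4, 0.4], [0.0, 0.6], [0.3, 0.1]]
second = attention(ctx1, visual_keys)          # second degree of association

# Dynamic combination: a gate g (learned in a real model, fixed here).
g = 0.7
multimodal_context = [g * a + (1 - g) * b for a, b in zip(first, second)]
print(multimodal_context)
```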
7. A named entity recognition apparatus based on a matching graph, comprising:
the text-level auxiliary information acquisition module is used for acquiring, according to a preset strategy, deep information of a text to be processed and attribute information of a matching picture attached to the text to be processed, wherein the attribute information is represented in text form; the acquiring of the deep information of the text to be processed comprises: acquiring entity knowledge information of the text to be processed according to a pre-constructed multi-modal knowledge graph, wherein the multi-modal knowledge graph comprises a plurality of entities and pictures associated with the entities;
the basic information extraction module is used for extracting text information of the text to be processed and visual information of the matching picture;
and the named entity recognition module is used for performing named entity recognition processing by combining the text information, the deep information, the attribute information, and the visual information to obtain an entity type sequence of the text to be processed.
8. The named entity recognition apparatus of claim 7, wherein the named entity recognition module comprises:
the text representation calculation unit is used for obtaining the text context representation of the character unit according to the text information of the character unit in the text to be processed, the deep information, and the attribute information of the matching picture;
the visual representation calculation unit is used for obtaining the visual context representation of the character unit according to the text context representation and the visual information;
the multi-modal fusion unit is used for fusing the text information, the text context representation, and the visual context representation of the character unit to obtain the comprehensive representation of the character unit;
and the entity type labeling unit is used for performing entity type labeling on the character unit according to the comprehensive representation.
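Read together, claims 7 and 8 describe a three-module device whose named entity recognition module decomposes into the four units above. The following sketch wires hypothetical classes in that shape; all class names, method names, and toy data are invented, and each module body is a trivial placeholder.

```python
class TextLevelAuxInfoModule:
    def acquire(self, text, picture):
        deep = {"Apple": ["company"]}       # entity knowledge (toy)
        attr = "logo"                       # picture attribute, text form
        return deep, attr

class BasicInfoExtractionModule:
    def extract(self, text, picture):
        return text.split(), [0.1, 0.2]     # token features, visual features

class NamedEntityRecognitionModule:
    def recognize(self, tokens, deep, attr, visual):
        # Stand-in for the representation/fusion/labeling units of claim 8.
        return ["ENTITY" if t in deep else "O" for t in tokens]

def device(text, picture):
    aux = TextLevelAuxInfoModule()
    basic = BasicInfoExtractionModule()
    ner = NamedEntityRecognitionModule()
    deep, attr = aux.acquire(text, picture)
    tokens, visual = basic.extract(text, picture)
    return ner.recognize(tokens, deep, attr, visual)

print(device("Apple releases a phone", "img_logo_07"))
# -> ['ENTITY', 'O', 'O', 'O']
```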
9. A named entity recognition device based on a matching graph, comprising:
one or more processors, a memory, and one or more computer programs, wherein the one or more computer programs are stored in the memory and comprise instructions which, when executed by the device, cause the device to perform the matching-graph-based named entity recognition method of any one of claims 1 to 6.
CN202110014000.7A 2021-01-06 2021-01-06 Named entity identification method, device and equipment based on matching graph Active CN112329471B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110014000.7A CN112329471B (en) 2021-01-06 2021-01-06 Named entity identification method, device and equipment based on matching graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110014000.7A CN112329471B (en) 2021-01-06 2021-01-06 Named entity identification method, device and equipment based on matching graph

Publications (2)

Publication Number Publication Date
CN112329471A (en) 2021-02-05
CN112329471B (en) 2021-04-20

Family

ID=74302264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110014000.7A Active CN112329471B (en) 2021-01-06 2021-01-06 Named entity identification method, device and equipment based on matching graph

Country Status (1)

Country Link
CN (1) CN112329471B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786836A (en) * 2014-12-22 2016-07-20 北京奇虎科技有限公司 Method and system for generating structured abstract of video webpage
US10726314B2 (en) * 2016-08-11 2020-07-28 International Business Machines Corporation Sentiment based social media comment overlay on image posts
US20180341863A1 (en) * 2017-05-27 2018-11-29 Ricoh Company, Ltd. Knowledge graph processing method and device
US20190108286A1 (en) * 2017-10-05 2019-04-11 Wayblazer, Inc. Concept networks and systems and methods for the creation, update and use of same to select images, including the selection of images corresponding to destinations in artificial intelligence systems
CN110674388A (en) * 2018-07-03 2020-01-10 百度在线网络技术(北京)有限公司 Mapping method and device for push item, storage medium and terminal equipment
CN111538845A (en) * 2020-04-03 2020-08-14 肾泰网健康科技(南京)有限公司 Method, model and system for constructing kidney disease specialized medical knowledge map
CN111816301A (en) * 2020-07-07 2020-10-23 平安科技(深圳)有限公司 Medical inquiry assisting method, device, electronic equipment and medium
CN111984771A (en) * 2020-07-17 2020-11-24 北京欧应信息技术有限公司 Automatic inquiry system based on intelligent conversation
CN112185520A (en) * 2020-09-27 2021-01-05 志诺维思(北京)基因科技有限公司 Text structured processing system and method for medical pathology report picture

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113779934A (en) * 2021-08-13 2021-12-10 远光软件股份有限公司 Multi-modal information extraction method, device, equipment and computer-readable storage medium
CN113779934B (en) * 2021-08-13 2024-04-26 远光软件股份有限公司 Multi-mode information extraction method, device, equipment and computer readable storage medium
CN114386422A (en) * 2022-01-14 2022-04-22 淮安市创新创业科技服务中心 Intelligent aid decision-making method and device based on enterprise pollution public opinion extraction
CN114386422B (en) * 2022-01-14 2023-09-15 淮安市创新创业科技服务中心 Intelligent auxiliary decision-making method and device based on enterprise pollution public opinion extraction
CN114444593A (en) * 2022-01-25 2022-05-06 中国电子科技集团公司电子科学研究院 Multi-mode event detection method and device
US20230281331A1 (en) * 2022-03-03 2023-09-07 Fujitsu Limited Control method and information processing apparatus
CN116665228A (en) * 2023-07-31 2023-08-29 恒生电子股份有限公司 Image processing method and device
CN116665228B (en) * 2023-07-31 2023-10-13 恒生电子股份有限公司 Image processing method and device

Also Published As

Publication number Publication date
CN112329471B (en) 2021-04-20

Similar Documents

Publication Publication Date Title
CN112329471B (en) Named entity identification method, device and equipment based on matching graph
CN111079444B (en) Network rumor detection method based on multi-modal relationship
CN113283551B (en) Training method and training device of multi-mode pre-training model and electronic equipment
CN111488931B (en) Article quality evaluation method, article recommendation method and corresponding devices
CN114342353B (en) Method and system for video segmentation
CN111626362B (en) Image processing method, device, computer equipment and storage medium
CN112836487B (en) Automatic comment method and device, computer equipment and storage medium
CN113627447A (en) Label identification method, label identification device, computer equipment, storage medium and program product
CN111783903B (en) Text processing method, text model processing method and device and computer equipment
CN115223020B (en) Image processing method, apparatus, device, storage medium, and computer program product
CN113642536B (en) Data processing method, computer device and readable storage medium
CN113553418B (en) Visual dialogue generation method and device based on multi-modal learning
CN116955707A (en) Content tag determination method, device, equipment, medium and program product
CN114418032A (en) Five-modal commodity pre-training method and retrieval system based on self-coordination contrast learning
Han et al. Adversarial multi-grained embedding network for cross-modal text-video retrieval
CN112861474B (en) Information labeling method, device, equipment and computer readable storage medium
Liu et al. A multimodal approach for multiple-relation extraction in videos
US20230359651A1 (en) Cross-modal search method and related device
CN117786058A (en) Method for constructing multi-mode large model knowledge migration framework
CN110851629A (en) Image retrieval method
CN116186312A (en) Multi-mode data enhancement method for data sensitive information discovery model
CN117828137A (en) Data query method, device, storage medium and program product
CN112381162B (en) Information point identification method and device and electronic equipment
Wu et al. [Retracted] Research on Multimodal Image Fusion Target Detection Algorithm Based on Generative Adversarial Network
CN110969187B (en) Semantic analysis method for map migration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant