CN116258138B - Knowledge base construction method, entity linking method, device and equipment - Google Patents

Knowledge base construction method, entity linking method, device and equipment

Info

Publication number
CN116258138B
Authority
CN
China
Prior art keywords
organization
entity
determining
candidate
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310269188.9A
Other languages
Chinese (zh)
Other versions
CN116258138A (en
Inventor
徐思琪
夏志群
龚建
孙珂
卓泽城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310269188.9A
Publication of CN116258138A
Application granted
Publication of CN116258138B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • G06F16/287Visualization; Browsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/288Entity relationship models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition

Abstract

The disclosure provides a knowledge base construction method, an entity linking method, an apparatus, a device, a storage medium and a program product, and relates to the technical field of data processing, in particular to the technical fields of big data and intelligent search. The specific implementation scheme is as follows: determining an organization entity according to organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity; and constructing an organization knowledge base according to the organization entity.

Description

Knowledge base construction method, entity linking method, device and equipment
Technical Field
The disclosure relates to the technical field of data processing, in particular to the technical fields of big data and intelligent search, and specifically relates to a knowledge base construction method, an entity linking method, an apparatus, a device, a storage medium and a program product.
Background
A knowledge base is a database for knowledge management that can be used for collection, arrangement and extraction of knowledge in the relevant fields. How to accurately and efficiently extract knowledge is a technical problem to be solved.
Disclosure of Invention
The present disclosure provides a knowledge base construction method, an entity linking method, an apparatus, a device, a storage medium, and a program product.
According to an aspect of the present disclosure, there is provided a knowledge base construction method, including: determining an organization entity according to the organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity; and constructing an organization knowledge base according to the organization entity.
According to another aspect of the present disclosure, there is provided an entity linking method, including: determining a reference text in the input text; determining candidate entities related to the reference text from the organization knowledge base according to the reference text; determining a target entity linked with the reference text according to the correlation between the candidate entity and the reference text, wherein the organization knowledge base is constructed by the following operations: determining an organization entity according to the organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity; and constructing an organization knowledge base according to the organization entity.
According to another aspect of the present disclosure, there is provided a knowledge base construction apparatus, including: the organization entity determining module is used for determining an organization entity according to the organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes the organization characteristics of the organization entity; and the knowledge base construction module is used for constructing the organization knowledge base according to the organization entity.
According to another aspect of the present disclosure, there is provided an entity linking apparatus, including: a reference text determining module for determining a reference text in an input text; a candidate entity determining module for determining, according to the reference text, candidate entities related to the reference text from an organization knowledge base; and a target entity determining module for determining a target entity linked with the reference text according to the candidate entities, wherein the organization knowledge base is constructed by using the following modules: an organization entity determining module for determining an organization entity according to organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity; and a knowledge base construction module for constructing the organization knowledge base according to the organization entity.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor and a memory communicatively coupled to the at least one processor. Wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the methods of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program stored on at least one of a readable storage medium and an electronic device, the computer program when executed by a processor implementing a method of an embodiment of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1A schematically illustrates a particular example of searching at a search engine based on input query content;
FIG. 1B schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a ranking model, according to one embodiment;
FIG. 1C schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a binary classification model, according to one embodiment;
FIG. 1D schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a multi-classification model, according to one embodiment;
FIG. 2 schematically illustrates a system architecture diagram of a knowledge base construction method, an entity linking method and apparatus, in accordance with an embodiment of the disclosure;
FIG. 3 schematically illustrates a flow chart of a knowledge base construction method, in accordance with an embodiment of the disclosure;
FIG. 4 schematically illustrates a schematic diagram of an entity linking method according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a schematic diagram of an entity linking method according to another embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of a knowledge base construction apparatus, in accordance with an embodiment of the disclosure;
FIG. 7 schematically illustrates a block diagram of an entity linking apparatus, according to an embodiment of the present disclosure; and
FIG. 8 schematically illustrates a block diagram of an electronic device in which the knowledge base construction method and the entity linking method of embodiments of the present disclosure may be implemented.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning as commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
A knowledge base is a database for knowledge management that can be used for collection, arrangement and extraction of knowledge in the relevant fields.
Entity linking (EL for short) is a knowledge extraction mode. Entity linking can be understood as a process of unambiguously pointing identified entity objects (e.g., person names, place names, organization names, etc.) in free text to a target entity in a knowledge base, that is, matching the entity object in the free text to the entity in the knowledge base that best fits it. If the corresponding target entity can be accurately queried, the specific content of the target entity can be pushed. Entity linking plays an important role in the fields of knowledge engineering and data mining, and is the basis of various downstream applications such as knowledge fusion, content analysis and knowledge indexing.
FIG. 1A schematically illustrates a specific example of searching at a search engine based on input query content. As shown in FIG. 1A, entities in the knowledge base 102 may be matched according to the input query content, and, for example, the specific content 101 of the matched Entity-1, Entity-2, Entity-3 and Entity-4 may be pushed. The specific content corresponding to Entity-1 is, for example, Text-1; Entity-2 corresponds to Text-2, Entity-3 corresponds to Text-3, and Entity-4 corresponds to Text-4, which will not be described in detail herein.
As shown in FIG. 1A, the entities matched with the query content may also have association relationships with one another. In the example of FIG. 1A, the association relationship between Entity-1 and Entity-2 is represented by the edge Edge1; the descriptions of the edges Edge2 to Edge5 are omitted here.
The entity linking specifically involves the following processes: determining candidate entities from a knowledge base according to the reference text of the input text; the candidate entities are ranked to determine target entities that match the reference text.
FIG. 1B schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a ranking model, according to one embodiment. In this approach, the similarity between the input text (the text to be disambiguated) and the entities in the knowledge base is calculated: the input text and the entities in the knowledge base are each modeled by a neural network to obtain their respective vector representations, the matching degree is then scored by a similarity measurement method, and the entity with the highest score is selected as the target entity.
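By way of illustration only, the ranking approach can be sketched as follows. The sketch assumes a toy character-count vector in place of the neural network encoder and cosine similarity as the similarity measure; the function names and the sample entity descriptions are hypothetical and are not taken from the disclosure.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a neural encoder: a character-count vector (assumption)."""
    return Counter(text)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(input_text, candidates):
    """Return (entity_id, score) pairs sorted by descending similarity."""
    query_vec = embed(input_text)
    scored = [(eid, cosine(query_vec, embed(desc))) for eid, desc in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate entities and their description texts.
candidates = {
    "Entity-1": "sony china electronics company",
    "Entity-2": "city nursing home",
}
ranking = rank_candidates("sony electronics", candidates)
target_entity = ranking[0][0]  # the highest-scoring candidate
```

In a real system the `embed` function would be replaced by the trained encoder; only the score-and-select step is meant to mirror the description above.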
FIG. 1C schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a binary classification model, according to one embodiment. In this approach, the reference text in the input text is encoded as a CLS vector (the special classification embedding, i.e., a vector used for classification that typically represents the overall sequence) and is paired one by one with the entities in the knowledge base to form samples. If the entity in a pair agrees with the label in the training data, that pair is taken as a positive sample, and the other pairs for the reference text are taken as negative samples. The samples are classified by a model such as ERNIE or BERT.
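By way of illustration only, the construction of positive and negative samples for such a classifier can be sketched as follows. The function name, the dictionary keys and the sample identifiers are hypothetical; the encoding step and the ERNIE/BERT classifier itself are omitted.

```python
def build_pair_samples(mention, candidate_ids, gold_id):
    """Pair one reference text with each candidate entity.

    A pair is labeled 1 (positive) when the entity matches the labeled
    entity in the training data, and 0 (negative) otherwise.
    """
    return [
        {"mention": mention, "entity": eid, "label": 1 if eid == gold_id else 0}
        for eid in candidate_ids
    ]


# Hypothetical training example: "Entity-1" is the labeled target entity.
samples = build_pair_samples("Sony", ["Entity-1", "Entity-2", "Entity-3"], "Entity-1")
```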
FIG. 1D schematically illustrates a schematic diagram of ranking candidate entities to determine a target entity using a multi-classification model, according to one embodiment. In this approach, the input text and the description text of an entity to be disambiguated are respectively input into a model such as ERNIE or BERT; the output vector at the CLS position of the input text and the output vector at the CLS position of the entity text are concatenated to obtain a vector representation of the entity; and finally classification is performed through a Dropout network layer and a fully connected (FC) layer.
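By way of illustration only, the concatenation, Dropout and fully connected steps can be sketched in plain Python as follows. The two CLS vectors, the weights and the dropout rate are hypothetical placeholder values; in practice the CLS vectors would come from the encoder and the weights would be learned. The final softmax is an assumption added to turn the layer outputs into class probabilities.

```python
import math
import random


def dropout(vec, rate, training, rng):
    """Inverted dropout: zero each element with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def fully_connected(vec, weights, bias):
    """One linear layer; `weights` holds one row per output class."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over the class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text_cls, entity_cls, weights, bias, rate=0.1, training=False, rng=None):
    """Concatenate the two CLS vectors, apply dropout, then the FC layer."""
    rng = rng or random.Random(0)
    joint = list(text_cls) + list(entity_cls)
    hidden = dropout(joint, rate, training, rng)
    return softmax(fully_connected(hidden, weights, bias))


# Hypothetical 2-dimensional CLS vectors and a 2-class FC layer.
text_cls = [0.2, -0.1]
entity_cls = [0.4, 0.3]
weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
bias = [0.0, 0.0]
probs = classify(text_cls, entity_cls, weights, bias)
```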
Fig. 2 schematically illustrates a system architecture of a knowledge base construction method, an entity linking method and an apparatus according to an embodiment of the present disclosure. It should be noted that fig. 2 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in fig. 2, a system architecture 200 according to this embodiment may include clients 201, 202, 203, a network 204, a first server 205, and a second server 206. The network 204 is used as a medium to provide communication links between the clients 201, 202, 203, the first server 205, and the second server 206. The network 204 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the first server 205, the second server 206, through the network 204 using the clients 201, 202, 203 to receive or send messages, etc. Various communication client applications may be installed on clients 201, 202, 203, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, and the like (by way of example only).
The clients 201, 202, 203 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like. The clients 201, 202, 203 of the disclosed embodiments may, for example, run applications.
The first server 205 and the second server 206 may be servers providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the clients 201, 202, 203. The background management server may analyze and process received data such as a user request, and feed back the processing result (e.g., a web page, information, or data obtained or generated according to the user request) to the client. In addition, the first server 205 and the second server 206 may be cloud servers, that is, servers having a cloud computing function.
Illustratively, a first server 205 may be used to build an organization knowledge base, a second server 206 may be used for entity linking, and the second server 206 may, for example, obtain input text from clients 201, 202, 203 and the organization knowledge base built by the first server 205.
The same server may also be used, for example, both to construct the organization knowledge base and to perform entity linking.
It should be noted that the knowledge base construction method provided by the embodiments of the present disclosure may be executed by the first server 205. Accordingly, the knowledge base construction apparatus provided by the embodiments of the present disclosure may be disposed in the first server 205. The knowledge base construction method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the first server 205 and is capable of communicating with the clients 201, 202, 203 and/or the first server 205. Accordingly, the knowledge base construction apparatus provided by the embodiments of the present disclosure may also be disposed in a server or a server cluster that is different from the first server 205 and is capable of communicating with the clients 201, 202, 203 and/or the first server 205. The entity linking method provided by the embodiments of the present disclosure may be executed by the second server 206. Accordingly, the entity linking apparatus provided by the embodiments of the present disclosure may be disposed in the second server 206. The entity linking method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the second server 206 and is capable of communicating with the first server 205. Accordingly, the entity linking apparatus provided by the embodiments of the present disclosure may also be disposed in a server or a server cluster that is different from the second server 206 and is capable of communicating with the first server 205.
It should be understood that the number of clients, networks, first servers, and second servers in fig. 2 are merely illustrative. There may be any number of clients, networks, first servers, and second servers, as desired for implementation.
It should be noted that, in the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of the personal information of users all comply with the provisions of the relevant laws and regulations, and do not violate public order and good customs.
In the technical scheme of the disclosure, the authorization or consent of the user is obtained before the personal information of the user is obtained or acquired.
The embodiment of the present disclosure provides a knowledge base construction method, and a knowledge base construction method according to an exemplary embodiment of the present disclosure is described below with reference to fig. 3 in conjunction with the system architecture of fig. 2. The knowledge base construction method of the embodiments of the present disclosure may be performed by the first server 205 shown in fig. 2, for example.
Fig. 3 schematically illustrates a flowchart of a knowledge base construction method, according to an embodiment of the present disclosure.
As shown in fig. 3, the knowledge base construction method 300 of the embodiment of the present disclosure may include, for example, operations S310 to S320.
In operation S310, an organization entity is determined from the organization data.
The organization entity has a custom attribute that characterizes the organization characteristics of the organization entity.
An organization can be understood as a social system that, once developed and refined to a certain extent, forms an internal structure whose parts are rigorous and relatively independent and that transfer or convert energy, matter and information among one another. As an entity with a specific meaning, the organization is also a common entity type in the field of natural language processing. It should be noted that, in the embodiments of the present disclosure, the "organization entity" and the "organization attribute" refer, respectively, to data that characterizes the organization entity and data that characterizes an attribute of the organization entity.
The organization data may be obtained, for example, from a corresponding organization administration department; in particular, the organization data may be obtained through a website, an interface, or the like provided by that department. The organization administration departments include, for example, an organization unified social credit code data service center, a related standardization administration department, a registration administration department, and the like.
In operation S320, an organization knowledge base is constructed from the organization entities.
For example, the organization entities may be stored in a database, thereby implementing a specific example of constructing an organization knowledge base from the organization entities.
According to the knowledge base construction method of the embodiments of the present disclosure, the organization entity is determined according to the organization data, and the organization entity has a custom attribute that characterizes the organization characteristics of the organization entity. The attributes of the organization entity are therefore richer, the knowledge base constructed from the organization entity is richer, and subsequent accurate and efficient knowledge extraction is facilitated.
The following description takes as an example the case in which an organization knowledge base constructed by the knowledge base construction method of the embodiments of the present disclosure is used for entity linking; however, the organization knowledge base obtained by the method is not limited to use in entity linking.
Illustratively, the custom attribute may include at least one of a description attribute and a pronunciation attribute.
The description attribute is used for characterizing at least one of the following items related to the organization entity: function information, brand information, product information, field information, address information, the abbreviated organization name, the full organization name, and person information.
The pronunciation attribute is used to characterize the pronunciation of the organization name. For example, the pronunciation attribute may be characterized by pinyin.
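By way of illustration only, an organization entity carrying the two custom attributes, and its storage in a database, can be sketched as follows. The class name, field names, table layout and sample values are all hypothetical; an in-memory SQLite database is assumed merely as one possible database.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class OrganizationEntity:
    """Hypothetical record for one organization entity."""
    entity_id: str
    name: str
    description: dict = field(default_factory=dict)  # description attribute
    pronunciation: str = ""                          # pronunciation attribute (pinyin)


def build_knowledge_base(entities):
    """Store the entities in an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE organization (entity_id TEXT PRIMARY KEY, name TEXT, attributes TEXT)"
    )
    for entity in entities:
        attributes = json.dumps(
            {"description": entity.description, "pronunciation": entity.pronunciation},
            ensure_ascii=False,
        )
        conn.execute(
            "INSERT INTO organization VALUES (?, ?, ?)",
            (entity.entity_id, entity.name, attributes),
        )
    conn.commit()
    return conn


# Hypothetical sample entity.
entities = [
    OrganizationEntity(
        entity_id="org-1",
        name="索尼（中国）有限公司",
        description={"brand": "索尼", "field": "信息技术"},
        pronunciation="suo ni zhong guo you xian gong si",
    )
]
kb = build_knowledge_base(entities)
row = kb.execute(
    "SELECT name, attributes FROM organization WHERE entity_id = ?", ("org-1",)
).fetchone()
```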
By way of example, description attributes such as brand information, product information and field information associated with an organization entity may be obtained from a related website.
For example, for an enterprise organization, some description attributes may be obtained from a website such as Maigoo (https://www.maigoo.com/brand/search/). For an enterprise organization such as "Sony (China) Co., Ltd.", the website provides a description along the lines of "Sony originated in Japan in 1946 and is a world-renowned large comprehensive multinational group", and states that the products it produces and sells include color TVs, notebook computers, audio equipment, digital cameras, video cameras, projectors, and the like. From this, brand information such as "Sony", the listed product information, and field information such as "information technology field" can be determined.
For example, for a listed enterprise organization, some description attributes may be obtained from a related website such as Eastmoney (http://data.eastmoney.com/gstc/).
Organizations may be divided into enterprise organizations and social organizations. The following situations exist:
(1) The names of enterprise organizations may be relatively similar, while the organizations themselves differ, for example, in brands, products and fields involved (understood herein as business fields). Likewise, different social organizations correspond to different function information.
(2) Owing to slips of the pen or other reasons, organization names may contain wrongly written characters, in particular characters that have the correct pronunciation but are the wrong characters. When entity linking is subsequently performed according to the organization knowledge base, wrong candidate entities may then be recalled, reducing the accuracy of entity linking.
According to the knowledge base construction method of the embodiments of the present disclosure, the description attribute allows organization entities to be characterized more accurately and distinctively, and the pronunciation attribute provides fault tolerance for organization names containing wrongly written characters. The determined organization entities are therefore richer in expression, better differentiated and more accurate, and are suited to organization scenarios; the resulting organization knowledge base is in turn richer and can be used to accurately link entities related to organizations.
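By way of illustration only, the fault tolerance provided by a pronunciation attribute can be sketched as an index keyed by pronunciation, so that a name written with a homophone of the correct character still recalls the intended entity. The pronunciation table below is hand-written and covers only this example; it is an assumption standing in for a complete character-to-pinyin resource, and the function names are hypothetical.

```python
# Hypothetical, hand-written pronunciation table covering only this example.
# "索" and "锁" share the same pinyin syllable, so they are homophones.
PINYIN = {"索": "suo", "锁": "suo", "尼": "ni", "公": "gong", "司": "si"}


def pronunciation_key(name):
    """Map a name to a space-separated pinyin key; unknown characters pass through."""
    return " ".join(PINYIN.get(ch, ch) for ch in name)


def build_pronunciation_index(names):
    """Group organization names by their pronunciation key."""
    index = {}
    for name in names:
        index.setdefault(pronunciation_key(name), []).append(name)
    return index


index = build_pronunciation_index(["索尼公司"])
# The query contains a wrongly written character with the same pronunciation.
recalled = index.get(pronunciation_key("锁尼公司"), [])
```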
Illustratively, according to a knowledge base construction method of another embodiment of the present disclosure, constructing an organization knowledge base from organization entities may be implemented, for example, as follows: structuring the organization entities to determine category information characterizing organization categories; clustering the category information to determine at least one organization category cluster; determining a knowledge dictionary of the organization categories according to the organization category cluster; and constructing the organization knowledge base according to the knowledge dictionary and the organization entities.
The organization entity includes an organization name, and the organization name includes category information.
A knowledge dictionary can be understood as knowledge represented in the form of a dictionary, where the "dictionary" may be stored in the form of a data table in an electronic device such as a computer.
Since the organization entity includes an organization name and the organization name includes category information, the category information characterizing the organization category may be determined, for example, by structurally parsing the organization name. This is one specific example of structuring the organization entity to determine the category information.
An organization name is generally formed by following a naming rule, so the organization name may be structurally parsed according to that naming rule to obtain its constituent words, which may include, for example, place name information, trade name information, industry information, and category information.
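By way of illustration only, structural parsing of an organization name under a fixed naming rule can be sketched as follows. The sketch assumes a name of the form place + distinctive name + industry + category, and uses small hypothetical word lists; the dictionary keys (`place`, `trade_name`, `industry`, `category`) are chosen for illustration, with `trade_name` holding the distinctive name of the organization.

```python
# Hypothetical word lists; a real system would use far larger lexicons.
PLACES = ["北京", "上海"]
INDUSTRIES = ["科技", "贸易"]
CATEGORIES = ["有限责任公司", "有限公司", "养老院"]


def parse_name(name):
    """Split an organization name into place, trade name, industry and category."""
    parts = {"place": "", "trade_name": "", "industry": "", "category": ""}
    rest = name
    # The place name, if present, is assumed to be a prefix.
    for place in PLACES:
        if rest.startswith(place):
            parts["place"] = place
            rest = rest[len(place):]
            break
    # The category is assumed to be a suffix; try longer suffixes first.
    for category in sorted(CATEGORIES, key=len, reverse=True):
        if rest.endswith(category):
            parts["category"] = category
            rest = rest[: -len(category)]
            break
    # The industry word is assumed to directly precede the category.
    for industry in INDUSTRIES:
        if rest.endswith(industry):
            parts["industry"] = industry
            rest = rest[: -len(industry)]
            break
    # Whatever remains is treated as the distinctive (trade) name.
    parts["trade_name"] = rest
    return parts


parsed = parse_name("北京百度网讯科技有限公司")
```

The `category` component obtained this way is the category information that the subsequent clustering step operates on.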
According to the knowledge base construction method of the embodiments of the present disclosure, category information characterizing the organization category is determined by structuring the organization entity. The category information may characterize, for example, the function of the organization, so clustering the category information groups together organization entities that are, for example, similar in function. The knowledge dictionary determined from the organization category clusters characterizes organization-related knowledge that has been summarized and sorted, so the organization knowledge base constructed from the knowledge dictionary and the organization entities is also richer, and subsequent entity linking through the organization knowledge base is more accurate and efficient.
For example, the knowledge dictionary and the organization entities may be stored together to construct the organization knowledge base.
By way of example, the category information may be clustered using various clustering algorithms, such as density clustering algorithms, hierarchical clustering algorithms, and the like, to determine at least one organization category cluster.
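By way of illustration only, clustering of category information can be sketched with a simple single-linkage agglomerative procedure, which is one form of hierarchical clustering. The similarity measure (the `SequenceMatcher` ratio from the Python standard library), the threshold value and the sample category strings are assumptions made for this sketch and are not specified by the disclosure.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Surface string similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_categories(categories, threshold=0.6):
    """Single-linkage agglomerative clustering with a similarity threshold.

    Two clusters are merged whenever any pair of their members is at least
    `threshold` similar; merging repeats until no such pair remains.
    """
    clusters = [[c] for c in categories]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(
                    similarity(a, b) >= threshold
                    for a in clusters[i]
                    for b in clusters[j]
                ):
                    clusters[i].extend(clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical category information extracted from organization names.
clusters = cluster_categories(["敬老院", "养老院", "有限公司", "有限责任公司"])
```

A surface similarity measure is used here only to keep the sketch self-contained; a measure that reflects the function of the organization could be substituted without changing the clustering loop.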
Illustratively, according to a knowledge base construction method of a further embodiment of the present disclosure, the knowledge dictionary may comprise a same-category dictionary. Determining the knowledge dictionary of the organization categories from an organization category cluster may be implemented, for example, as follows: determining same-category information according to the category information associated with the same organization category cluster; and determining the same-category dictionary according to the same-category information.
It will be appreciated that for the same organization or class of organizations, there are situations where the expression for the organization is inconsistent due to the different expression habits of each individual.
For example, taking the same category dictionary as a data table, the data table may include at least one category field, each organization category cluster corresponds to a respective category field, and for any organization category cluster, the related category information is mapped to the category field corresponding to the organization category cluster to obtain the same category information. The fields corresponding to the same category information form a same category dictionary.
According to the knowledge base construction method of the embodiment of the disclosure, same-category information representing organizations of the same category can be determined from the category information associated with the same organization category cluster, and the same-category dictionary can be determined from the same-category information. Because the same-category dictionary is obtained by summarizing and organizing the category information of organizations, an organization category cluster for one category can cover more organizations with similar functions. Entity linking based on the same-category dictionary can therefore accurately match more candidate entities, and more relevant ones.
Illustratively, consider two organization names, one containing "saleshouse" and the other containing "nursing home". Structuring the organization names yields two pieces of category information representing organization categories, "saleshouse" and "nursing home". It can be understood that "saleshouse" and "nursing home" are relatively similar, so clustering the category information can determine "saleshouse" and "nursing home" as category information associated with the same organization category cluster.
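The mapping from clustered category information to category fields can be sketched as follows. This is a minimal illustration, not the patented implementation: the cluster ids are assumed to come from an upstream clustering step, and the function names, the `category_<id>` field naming, and the extra "education bureau" entry are hypothetical.

```python
from collections import defaultdict


def build_same_category_dictionary(cluster_of):
    """Map each cluster's category field to the category strings in that cluster.

    cluster_of: dict of category information -> cluster id (assumed to be
    produced by an earlier clustering step).
    """
    fields = defaultdict(list)
    for category, cluster_id in cluster_of.items():
        fields[f"category_{cluster_id}"].append(category)
    # Sort members so the dictionary is deterministic.
    return {field: sorted(members) for field, members in fields.items()}


def same_category_terms(dictionary, category):
    """Return every category string that shares a field with `category`."""
    for members in dictionary.values():
        if category in members:
            return members
    # Unknown categories only match themselves.
    return [category]


# Hypothetical clustering result for three pieces of category information.
clusters = {"saleshouse": 0, "nursing home": 0, "education bureau": 1}
same_category = build_same_category_dictionary(clusters)
```

During entity linking, `same_category_terms(same_category, "saleshouse")` would expand a mention's category to every term in the same field, which is one way the dictionary could widen candidate recall.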
Illustratively, according to a knowledge base construction method of a further embodiment of the present disclosure, the knowledge dictionary may further include an abbreviation dictionary. Determining the knowledge dictionary of organization categories from the organization category clusters may be implemented, for example, as follows: for any organization category cluster, an abbreviation set of each organization is determined from the parsing result of the organization name, and the abbreviation dictionary of the organization category is determined from the abbreviation set of each organization.
For the same organization, or for organizations of the same category, people's expression habits differ, so references to the organization may be inconsistent. For example, an organization may be referred to by an abbreviation.
According to the knowledge base construction method of the embodiment of the disclosure, for any organization category cluster, the abbreviation set of each organization is determined from the parsing result of the organization name, and the abbreviation dictionary of the organization category is determined from those abbreviation sets. Organization-related knowledge is thereby summarized along the abbreviation dimension, and entity linking based on the abbreviation dictionary can accurately match more candidate entities, and more relevant ones.
For example, the input text for entity linking may include abbreviations such as "city administration" and "rule self bureau", whose corresponding organization entities have the full names "city administration committee" and "planning and natural resource bureau", respectively.
For example, each organization name abbreviation in the abbreviation set may also be mapped to the corresponding organization full name, so that the full name can be looked up from the abbreviation during subsequent entity linking.
Illustratively, according to the knowledge base construction method of a further embodiment of the present disclosure, determining the abbreviation set of each organization from the parsing result of the organization name, for any organization category cluster, may be implemented, for example, as follows: for any organization category cluster, the abbreviation set of each organization is determined from the parsing result of the organization name and an abbreviation combination rule.
The abbreviation combination rule characterizes how an organization name abbreviation is generated from the components of the organization name, where the components are obtained by parsing the organization name.
Still taking the above "city management committee" as an example, the organization name "city management committee" follows, for example, a naming rule, and structurally parsing the organization name yields a parsing result that includes, for example, "city", "management" and "committee". The first character of each of "city", "management" and "committee" may then be taken and concatenated to obtain the abbreviation of the organization name.
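The first-character rule described above can be sketched as follows. This is only an illustration under stated assumptions: the components are taken as already parsed, English tokens stand in for the original-language components (so the result is an initialism rather than the abbreviation quoted in the text), and `abbreviate` and `build_abbreviation_index` are hypothetical names.

```python
def abbreviate(components):
    """Apply one abbreviation combination rule: join the first character
    of each parsed component of the organization name."""
    return "".join(part[0] for part in components if part)


def build_abbreviation_index(parsed_names):
    """Map each generated abbreviation to the full names that produce it,
    so a full name can be looked up from an abbreviation later."""
    index = {}
    for full_name, components in parsed_names.items():
        index.setdefault(abbreviate(components), set()).add(full_name)
    return index


# Parsing result assumed to be given (e.g. by a sequence labeling model).
parsed = {"city management committee": ["city", "management", "committee"]}
abbreviation_index = build_abbreviation_index(parsed)
```

Several full names can share one abbreviation, which is why the index stores a set per abbreviation rather than a single name.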
Illustratively, in the knowledge base construction method according to a further embodiment of the present disclosure, the knowledge dictionary further includes, for example, a character mapping dictionary that characterizes a mapping relationship between character information and organization entities.
It will be appreciated that organizations may be divided into, for example, enterprise organizations and social organizations. For a social organization, a particular character (i.e., a person in a particular role) may perform the corresponding function of the social organization, so the corresponding organization entity can be accurately matched from the character information. For example, the character information "police" is associated with the social organization "public security office".
According to the knowledge base construction method of the embodiment of the disclosure, the character mapping dictionary characterizes the mapping relationship between character information and organization entities, so the corresponding organization entity can be accurately matched from character information.
For example, candidate entities can then be accurately recalled from the organization knowledge base during subsequent entity linking.
Illustratively, the character mapping dictionary may be managed and maintained, for example, by relevant personnel.
Illustratively, the knowledge dictionary may also include an alias dictionary.
The alias dictionary may also be managed and maintained, for example, by the relevant personnel, and includes organization name aliases.
Illustratively, an alias may include, for example, a historical name. For example, "OPPO guangdong mobile communication limited" has the historical name "eastern guangdong european mobile communication limited". Thus, during subsequent entity linking, candidate entities can be accurately recalled by alias.
For example, the organization name may be input into a sequence labeling model, which parses the organization name to obtain its word components. The organization name alias, the organization name abbreviation, and the like may then be determined from the word components. The input of the sequence labeling model is a sequence of length N; each element of the input sequence is labeled, giving a label sequence of length N, where the labels include, for example, a place name label and a word size label.
The model structure of the sequence labeling model may include, for example, a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF).
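Turning a length-N label sequence back into word components can be sketched as below. The sketch covers only the decoding step, not the BiLSTM-CRF model itself. It assumes a B-/I-/O tagging scheme, which the text does not specify, and the label names `LOC` and `CAT` and the token split are hypothetical.

```python
def decode_components(tokens, labels):
    """Group N tokens and their N labels into (label, text) word components.

    Assumes B-/I-/O style tags: "B-X" starts a component of type X,
    "I-X" continues it, and anything else closes the open component.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length N")

    components = []
    current_label, current_tokens = None, []

    def flush():
        # Emit the component collected so far, if any.
        if current_tokens:
            components.append((current_label, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()
            current_label, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_label == label[2:]:
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close the component, skip the token.
            flush()
            current_label, current_tokens = None, []
    flush()
    return components


# Hypothetical model output for a four-token organization name.
tokens = ["XX", "city", "health", "committee"]
labels = ["B-LOC", "I-LOC", "B-CAT", "I-CAT"]
components = decode_components(tokens, labels)
```

Tokens are joined with a space here because the example uses English tokens; for a character-level model over the original-language name, joining without a separator would be the natural choice.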
The embodiment of the present disclosure further provides an entity linking method, and the entity linking method according to an exemplary embodiment of the present disclosure is described below with reference to fig. 4 to 5 in conjunction with the system architecture of fig. 2. The entity linking method of the embodiments of the present disclosure may be performed by the second server 206 shown in fig. 2, for example.
Fig. 4 schematically illustrates a flow chart of an entity linking method according to an embodiment of the present disclosure.
As shown in fig. 4, the entity linking method 400 of the embodiment of the present disclosure may include, for example, operations S410 to S430.
In operation S410, a reference text in the input text is determined.
In entity linking, the initial data may be an input text or a reference text. The reference text may be understood as a mention, i.e., the specific content referred to in the initial data.
In operation S420, candidate entities associated with the reference text are determined from the organization knowledge base based on the reference text.
In operation S430, a target entity linked with the reference text is determined according to the candidate entity.
The organization knowledge base is constructed by the following operations: determining an organization entity according to the organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity; and constructing an organization knowledge base according to the organization entity.
It should be noted that the organization knowledge base is obtained with the knowledge base construction method described above. Its construction has been described in detail in the above embodiments and is not repeated here.
In the entity linking method according to the embodiment of the present disclosure, the organization knowledge base is obtained with the knowledge base construction method of the above embodiments. Because the organization entity has a custom attribute that characterizes its organization features, the attributes of the organization entity, and the organization knowledge base constructed from it, are richer. The candidate entities determined from the organization knowledge base according to the reference text are therefore more numerous and more accurate, so the target entity linked with the reference text can be determined accurately.
Illustratively, according to an entity linking method of another embodiment of the present disclosure, the candidate entity includes a first candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: address information corresponding to the reference text is determined from the input text corresponding to the reference text and the parsing result of the reference text; the address information is then matched with the organization entities in the organization knowledge base to determine the first candidate entity.
The custom attributes of the organization entities include descriptive attributes that characterize address information associated with the organization entities.
Social organizations in particular usually have an address prefix, but in practice the address information may be omitted. For example, in some cases "A city educational bureau" is written as "city educational bureau" or "educational bureau". A large number of candidate entities may then be recalled from the organization knowledge base, but the recalled candidate entities are not accurate.
For the case that the initial data is the input text, the input text may express complete semantics, for example, the input text may include address information, and the address information is associated with an organization corresponding to the reference text with a high probability.
The open domain information extraction model extracts information from the input text, including but not limited to named entities, relationships, event arguments, event description fragments, evaluation dimensions, opinion words, and emotional tendencies. The open domain information extraction model may include, for example, a UIE model.
By way of example, address information in the input text may be extracted, for example, by an open domain information extraction model.
Illustratively, according to the entity linking method of a further embodiment of the present disclosure, matching the address information corresponding to the reference text with the organization entities in the organization knowledge base to determine the first candidate entity may be implemented, for example, as follows: the address information corresponding to the reference text is matched with the organization entities in the organization knowledge base at each hierarchical address of the address hierarchy, giving a matching result for each hierarchical address; the first candidate entity is then determined from the matching results of the hierarchical addresses.
The address hierarchy includes an administrative division address, which includes a plurality of hierarchical addresses. For example, the administrative division address may include: province, city, county/district, township/town, village; or directly administered city, county/district, township/town, village.
According to the entity linking method of the embodiment of the disclosure, the address information corresponding to the reference text is matched with the organization entities in the organization knowledge base at each hierarchical address of the address hierarchy. Address information can thus be matched precisely level by level, so an accurate first candidate entity can be determined.
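Level-by-level address matching can be sketched as follows. This is a minimal illustration, assuming addresses are already split into named levels and that a candidate must agree on at least one known level without conflicting on any. The level names, the acceptance rule, and the "YY city" entity are assumptions, not details given in the text.

```python
# Hypothetical names for the levels of the administrative division address.
LEVELS = ("province", "city", "district", "town", "village")


def match_levels(mention_addr, entity_addr):
    """Compare two addresses level by level.

    Returns a dict of level -> True (same value), False (conflict), or
    None (level missing on either side, so there is no evidence).
    """
    result = {}
    for level in LEVELS:
        a, b = mention_addr.get(level), entity_addr.get(level)
        result[level] = None if a is None or b is None else a == b
    return result


def first_candidates(mention_addr, entities):
    """Keep entities with at least one matching level and no conflicting level."""
    kept = []
    for entity in entities:
        outcome = list(match_levels(mention_addr, entity["address"]).values())
        if any(v is True for v in outcome) and not any(v is False for v in outcome):
            kept.append(entity["name"])
    return kept


entities = [
    {"name": "XX city health committee", "address": {"city": "XX city"}},
    {"name": "YY city health committee", "address": {"city": "YY city"}},
]
recalled = first_candidates({"city": "XX city"}, entities)
```

Treating a missing level as "no evidence" rather than a mismatch lets a mention that only states a city still match an entity whose address is recorded in more detail.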
By way of example, determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented as follows: entity category recognition is performed on the reference text to obtain the entity category of the reference text; candidate entities related to the reference text are then determined from the correlation between the entity category of the reference text and the entity category of the organization entity.
It should be noted that "entity category" may be understood as the category of an entity, whereas the "category information characterizing the organization category" mentioned in the above embodiments is the category of an organization. For example, the category information of an organization may include "public security bureau", "education bureau" and the like, while entity categories may include terms, living things, diet, organizations and the like. The entity category of an organization entity is organization.
It can be appreciated that the entities of the same category have high correlation, and for accurate and efficient entity linking, for example, entity linking may be performed with respect to the same entity category. For example, in the case where the entity class of the reference text is an organization, the reference text may no longer be matched to non-organization entities, or in the case where the entity class of the reference text is not an organization, the reference text may no longer be matched to organization entities of an organization knowledge base, for example.
For example, where the entity category of the reference text is "term entity", "biological entity", or another entity category different from that of the organization entity, entity linking against the organization knowledge base may be skipped.
According to the entity linking method of the embodiment of the disclosure, identifying the entity category of the reference text makes it possible to screen for entities of the same or a related category. Entities of the same or a related category are more strongly correlated, so candidate entities highly correlated with the reference text can be determined from the correlation between the entity category of the reference text and the entity category of the organization entity, which improves the accuracy of entity linking.
Illustratively, according to an entity linking method of a further embodiment of the disclosure, the candidate entity includes a second candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: the organization name abbreviation corresponding to the reference text is determined from the parsing result of the reference text; candidate abbreviations are determined from that organization name abbreviation and the abbreviation dictionary; the candidate abbreviations are then matched with the organization entities in the organization knowledge base to determine the second candidate entity.
The custom attributes of the organization entities include description attributes that are also used to characterize organization name acronyms associated with the organization entities.
For example, the organization name abbreviation corresponding to the reference text may be determined according to the parsing result of the reference text and the abbreviation combination rule.
For example, the organization name abbreviation of the reference text may be compared with a dictionary of abbreviations, and at least one abbreviation with higher relevance may be determined as a candidate abbreviation.
According to the entity linking method of the embodiment of the disclosure, the knowledge dictionary of the organization knowledge base includes an abbreviation dictionary. Candidate abbreviations are determined from the organization name abbreviation and the abbreviation dictionary, and the candidate abbreviations are matched with the organization entities in the organization knowledge base, so entities can be recalled along the organization name abbreviation dimension. This improves the accuracy of entity linking and suits scenarios in which organization name abbreviations are used frequently.
For example, the candidate alias may also be determined from the organization name alias and the alias dictionary corresponding to the reference text. And matching the candidate aliases with the organization entities in the organization knowledge base to determine candidate entities. The custom attributes of the organization entities include description attributes that are also used to characterize organization name aliases associated with the organization entities.
Illustratively, according to an entity linking method of a further embodiment of the present disclosure, the candidate entity includes a third candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: the organization full name corresponding to the reference text is determined from the parsing result of the reference text; the organization full name is then matched with the organization entities in the organization knowledge base to determine the third candidate entity.
The custom attributes of the organization entities include descriptive attributes that are also used to characterize the organization full name associated with the organization entities.
For example, an organization name abbreviation corresponding to the reference text may be determined from the parsing result of the reference text, and the corresponding organization full name may be determined from the organization name abbreviation and a name mapping relation. The name mapping relation characterizes, for example, the mapping between organization name abbreviations and organization full names.
According to the entity linking method of the embodiment of the disclosure, the organization full name corresponding to the reference text is determined from the parsing result of the reference text and is matched with the organization entities in the organization knowledge base to determine the third candidate entity. Entities can thus be recalled along the organization full name dimension, which improves the accuracy of entity linking.
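Recalling the third candidate entity through a name mapping relation can be sketched as follows. The sketch assumes the mapping is a plain dictionary from abbreviation to full names and that each entity stores its full name under a `full_name` attribute. The function name, the attribute name, and the entity ids are hypothetical.

```python
def third_candidates(abbreviation, name_mapping, knowledge_base):
    """Resolve an abbreviation to full names, then recall entities whose
    full-name attribute equals one of those names."""
    full_names = set(name_mapping.get(abbreviation, []))
    return [entity for entity in knowledge_base if entity["full_name"] in full_names]


# Name mapping relation built from the abbreviation examples in the text.
name_mapping = {
    "city administration": ["city administration committee"],
    "rule self bureau": ["planning and natural resource bureau"],
}
knowledge_base = [
    {"id": 1, "full_name": "city administration committee"},
    {"id": 2, "full_name": "planning and natural resource bureau"},
]
hits = third_candidates("rule self bureau", name_mapping, knowledge_base)
```

An abbreviation missing from the mapping simply yields no candidates, leaving recall to the other matching dimensions.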
Illustratively, according to an entity linking method of a further embodiment of the disclosure, the candidate entity includes a fourth candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: at least one of function information, brand information, product information and field information corresponding to the reference text is determined from the parsing result of the reference text; that information is then matched with the organization entities in the organization knowledge base to determine the fourth candidate entity.
The reference text may comprise, for example, relevant descriptive content of the organization entity, so that the reference text may also comprise, for example, at least one of function information characterizing the function, brand information characterizing the brand, product information characterizing the product, domain information characterizing the domain.
The custom attributes of the organization entities include descriptive attributes that are further used to characterize at least one of functional information, branding information, product information, and domain information related to the organization entities.
According to the entity linking method disclosed by the embodiment of the disclosure, the entity recall can be performed from the dimension of the custom attribute such as the function information, the brand information, the product information, the field information and the like through the operation, and the accuracy of the entity linking can be improved through the richer custom attribute of the entity.
Illustratively, according to an entity linking method of a further embodiment of the present disclosure, the candidate entity includes a fifth candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: pronunciation information corresponding to the reference text is determined from the parsing result of the reference text; the pronunciation information is then matched with the organization entities in the organization knowledge base to determine the fifth candidate entity.
The custom attributes of the organization entities include a pronunciation attribute that characterizes the pronunciation of the organization name.
According to the entity linking method disclosed by the embodiment of the disclosure, the entity recall can be performed from the dimension of the pronunciation attribute through the operation, and the accuracy of the entity link can be improved through the richer custom attribute of the entity.
Illustratively, the name of an organization entity includes several kinds of information, such as address information, word size information, and category information. Of these, the word size (the word size being the substantive part of the name) matters most for matching an entity accurately. The word size information of the reference text may therefore be determined from the parsing result of the reference text, and the fifth candidate entity may be determined by matching the pronunciation information of that word size information with the organization entities in the organization knowledge base. Correspondingly, in the stage of constructing the organization knowledge base, the pronunciation attribute of an organization entity may be, for example, the pronunciation information of the word size information of the organization entity.
Illustratively, according to an entity linking method of a further embodiment of the disclosure, the candidate entity includes a sixth candidate entity. Determining, according to the reference text, candidate entities related to the reference text from the organization knowledge base may be implemented, for example, as follows: character information corresponding to the reference text is determined from the parsing result of the reference text; candidate character information is determined from that character information and the character mapping dictionary; the candidate character information is then matched with the organization entities in the organization knowledge base to determine the sixth candidate entity.
The custom attributes of the organization entities include descriptive attributes that are also used to characterize character information associated with the organization entities.
According to the entity linking method of the embodiment of the disclosure, the knowledge dictionary of the organization knowledge base includes a character mapping dictionary. Candidate character information is determined from the character information corresponding to the reference text and the character mapping dictionary, and the candidate character information is matched with the organization entities in the organization knowledge base to determine the sixth candidate entity. Entities can thus be recalled along the character information dimension, which improves the accuracy of entity linking and suits scenarios in which character information is associated with an organization.
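Recall through the character mapping dictionary can be sketched as follows. The text does not fix the structure of the dictionary or of the entity attributes, so this is one possible reading: the dictionary maps character information to organization categories, and each entity carries a `category` attribute. All names and the two knowledge base entries are hypothetical.

```python
def sixth_candidates(character_info, character_mapping, knowledge_base):
    """Look up the organization categories associated with a character,
    then recall the entities that belong to one of those categories."""
    categories = character_mapping.get(character_info, set())
    return [e["name"] for e in knowledge_base if e["category"] in categories]


# Mapping taken from the "police" example in the text.
character_mapping = {"police": {"public security office"}}
knowledge_base = [
    {"name": "XX city public security office", "category": "public security office"},
    {"name": "XX city education bureau", "category": "education bureau"},
]
recalled_by_character = sixth_candidates("police", character_mapping, knowledge_base)
```

When no character information is parsed from the reference text, this lookup is skipped and contributes no candidates.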
Illustratively, character information in the reference text may be determined, for example, by the Chinese word-class knowledge tagging tool WordTag.
Illustratively, according to an entity linking method of a further embodiment of the present disclosure, determining the target entity linked with the reference text according to the candidate entity may be implemented, for example, as follows: a correlation evaluation value of each candidate entity is determined from the reference evaluation value and the weight of the candidate entity; the target entity is then determined from the candidate entities according to their correlation evaluation values.
The candidate entities include at least one of a first candidate entity, a second candidate entity, a third candidate entity, a fourth candidate entity, a fifth candidate entity, and a sixth candidate entity.
For example, the reference evaluation value and the weight may be predetermined, and both may also be adjusted.
According to the entity linking method of the embodiment of the disclosure, the correlation evaluation value of each candidate entity is determined from its reference evaluation value and weight, and the target entity most relevant to the reference text can be accurately determined from the candidate entities according to those correlation evaluation values. Because the reference evaluation value and the weight are adjustable, the method can meet the requirements of different scenarios and is more flexible.
For example, at least one of the first candidate entity, the second candidate entity, the third candidate entity, the fourth candidate entity, the fifth candidate entity and the sixth candidate entity may be converted into word vectors and spliced to obtain candidate entity word vectors, and the candidate entity word vectors may be input into a ranking model to obtain target vectors.
The ranking model may be, for example, a similarity matching model, a classification model, a multi-classification model, etc. (e.g., as shown in fig. 1B-1C).
Illustratively, according to the entity linking method of a further embodiment of the present disclosure, determining the correlation evaluation value of the candidate entity from the reference evaluation value and the weight of the candidate entity may be implemented, for example, as follows: the candidate entity is parsed to obtain its word components; for any candidate entity, the correlation evaluation value of the candidate entity is then determined from the reference evaluation values and the weights of its word components.
For example, the organization name of the candidate entity may be input into a sequence labeling model, and the organization name of the candidate entity may be parsed by the sequence labeling model to obtain the word components of the organization name of the candidate entity.
According to the entity linking method of the embodiment of the disclosure, parsing the candidate entity yields its word components at a finer granularity. For any candidate entity, the correlation evaluation value determined from the reference evaluation values and the weights of the word components is more accurate, so the target entity determined from the candidate entities is more accurate, i.e., entity linking is more accurate and efficient.
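One way to combine per-component reference evaluation values and weights is a weighted sum, sketched below. The text says only that the correlation evaluation value is determined from the reference evaluation values and weights, so the weighted-sum form is an assumption. The component type names and the numbers are hypothetical.

```python
def correlation_value(matched_components, reference_values, weights):
    """Weighted sum of reference evaluation values over the word components
    of a candidate entity that matched the reference text.

    Components without an explicit weight default to a weight of 1.0.
    """
    return sum(
        reference_values[component] * weights.get(component, 1.0)
        for component in matched_components
    )


# Hypothetical, adjustable configuration keyed by component type.
reference_values = {"category": 3.0, "address": 2.0}
weights = {"category": 1.0, "address": 0.5}

value = correlation_value(["category", "address"], reference_values, weights)
```

Keeping the reference values and weights in plain dictionaries reflects the point made above that both may be predetermined and later adjusted per scenario.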
Fig. 5 schematically illustrates a schematic diagram of an entity linking method according to yet another embodiment of the present disclosure.
As shown in fig. 5, the input text is "city Wei Jianwei: the XX epidemic situation is overall stable, and all recent infected persons are people with history of foreign province, city and residence and related people I epidemic situation prevention and control release meeting. The reference text segment in the input text is assigned to city Wei Jian. By performing entity class recognition (namely, "concept recognition" in fig. 5 is "entity class recognition") on the reference text, the entity class of the reference text can be determined to be an organization (the entity class of the organization is characterized by "organization class" in fig. 5), and the address information related to the introduced text can be determined to be "XX city" according to the introduced text and the address information in the input text. Corresponding organization names are also determined to be abbreviated as 'city Wei Jian commission' according to the analysis result of the reference text.
In the stage of determining candidate entities, the matching of descriptive attributes such as brand information, product information and the like can be realized through address matching, similar word dictionary matching, organization name full name matching, organization name short name matching, pronunciation attribute matching, character information matching and the like. Whereby the candidate entity may be recalled. In the example of fig. 5, the persona information is not parsed from the reference text of "city Wei Jianwei", and therefore the reference text is not compared to the persona mapping dictionary.
In the target entity determination stage, the candidate entities may, for example, be ranked by relevance evaluation value, and the candidate entity that ranks first and/or whose relevance evaluation value is greater than a predetermined threshold may be taken as the target entity. If the relevance evaluation values of all candidate entities are less than the predetermined threshold, the current reference text may be considered unlinkable to any target entity in the organization knowledge base. In the example of fig. 5, the word components corresponding to the reference text include "Wei Jian delegation" and the address information "XX city"; the reference evaluation value of "Wei Jian delegation" is 3, the reference evaluation value of the address information "XX city" is 2, and the two may, for example, have the same weight.
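The scoring and selection described above can be sketched in Python as follows. This is a minimal illustration only: it assumes that the relevance evaluation value is a weighted sum of the reference evaluation values of the word components, and that the target entity must both rank first and exceed the threshold. The passage above does not fix either choice. The function names, the equal weights of 0.5, and the threshold value are hypothetical.

```python
from typing import Dict, Optional, Tuple


def relevance_score(components: Dict[str, Tuple[float, float]]) -> float:
    """Weighted sum over word components (an assumed combination rule).

    `components` maps each word component to (reference value, weight).
    """
    return sum(value * weight for value, weight in components.values())


def pick_target(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-ranked candidate if it exceeds the threshold, else None."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Values from the fig. 5 example; the equal weights of 0.5 are an assumption.
fig5_components = {"Wei Jian delegation": (3.0, 0.5), "XX city": (2.0, 0.5)}
fig5_score = relevance_score(fig5_components)  # 3 * 0.5 + 2 * 0.5 = 2.5
```

Returning `None` corresponds to the case in which the reference text cannot be linked to any entity in the organization knowledge base.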
In the example of fig. 5, the reference text can be linked to a target entity in the organization knowledge base, namely the "XX city health committee".
In the example of fig. 5, the organization knowledge base is represented by the "offline knowledge base", which includes, for example, organization entities and knowledge dictionaries. The knowledge dictionaries include a same-category dictionary (i.e., a type synonym dictionary), a persona mapping dictionary (i.e., a special persona-organization mapping table), and an abbreviation dictionary (i.e., a government organization abbreviation dictionary).
Fig. 6 schematically shows a block diagram of a knowledge base construction apparatus, according to an embodiment of the present disclosure.
As shown in fig. 6, the knowledge base construction apparatus 600 of the embodiment of the disclosure includes, for example, an organization entity determination module 610 and a knowledge base construction module 620.
The organization entity determining module 610 is configured to determine an organization entity according to organization data, where the organization entity has a custom attribute, and the custom attribute characterizes an organization feature of the organization entity.
The knowledge base construction module 620 is configured to construct an organization knowledge base according to the organization entity.
Illustratively, the knowledge base construction module includes: the category information determining submodule is used for structuring the organization entity and determining category information used for representing the category of the organization, wherein the organization entity comprises an organization name, and the organization name comprises the category information; the category cluster determining sub-module is used for clustering category information and determining at least one organization category cluster; the knowledge dictionary determining submodule is used for determining a knowledge dictionary of the organization category according to the organization category cluster; and the knowledge base determining submodule is used for constructing an organization knowledge base according to the knowledge dictionary and the organization entity.
Illustratively, the knowledge dictionary determining submodule includes: the same category information determining unit is used for determining same category information according to category information related to the same organization category cluster; and a same-category dictionary determining unit configured to determine a same-category dictionary based on the same-category information.
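A same-category dictionary of this kind can be sketched in Python as follows. This is an illustration under stated assumptions: the clustering step is taken as already done upstream, so each category term arrives with a cluster identifier, and the dictionary simply maps every term to the other terms in its cluster. The function name and the sample terms are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_same_category_dictionary(
    labelled_terms: Iterable[Tuple[str, int]],
) -> Dict[str, Set[str]]:
    """Map each category term to the other terms in the same cluster."""
    clusters: Dict[int, Set[str]] = defaultdict(set)
    for term, cluster_id in labelled_terms:
        clusters[cluster_id].add(term)

    dictionary: Dict[str, Set[str]] = {}
    for members in clusters.values():
        for term in members:
            dictionary[term] = members - {term}
    return dictionary


# Hypothetical clustering output: (category term, cluster id).
sample = [("committee", 0), ("commission", 0), ("bureau", 1)]
same_category = build_same_category_dictionary(sample)
```

A term that is alone in its cluster maps to an empty set, so lookups never fail for known terms.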
Illustratively, the knowledge dictionary further includes an abbreviation dictionary; the knowledge dictionary determining submodule includes: an abbreviation set determining unit configured to determine, for any organization category cluster, an abbreviation set of each organization according to the analysis result of the organization name; and an abbreviation dictionary determining unit configured to determine an abbreviation dictionary of the organization category according to the abbreviation set of each organization.
Illustratively, the abbreviation set determining unit includes: an abbreviation set determining subunit configured to determine, for any organization category cluster, the abbreviation set of each organization according to the analysis result of the organization name and an abbreviation combination rule, where the abbreviation combination rule characterizes a rule for generating an organization name abbreviation from the components of the organization name, and the components of the organization name are obtained by analyzing the organization name.
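The combination rule for generating organization name abbreviations can be sketched in Python as follows. This is an illustrative assumption about the rule format: each rule is an ordered list of component types, and an abbreviation is produced by joining the matching components of the parsed name. The component type names ("region", "function", "category", "brand") and the sample name are hypothetical.

```python
from typing import Dict, Sequence, Set


def abbreviation_set(
    components: Dict[str, str],
    rules: Sequence[Sequence[str]],
) -> Set[str]:
    """Apply each combination rule whose component types are all present."""
    result: Set[str] = set()
    for rule in rules:
        if all(key in components for key in rule):
            result.add(" ".join(components[key] for key in rule))
    return result


# Hypothetical parse of the organization name "XX city health committee".
parsed_name = {"region": "XX city", "function": "health", "category": "committee"}
combination_rules = [
    ("function", "category"),  # yields "health committee"
    ("region", "function"),    # yields "XX city health"
    ("region", "brand"),       # skipped: the name has no "brand" component
]
abbreviations = abbreviation_set(parsed_name, combination_rules)
```

Rules that refer to a component the name does not contain are skipped rather than producing a partial abbreviation.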
Illustratively, the knowledge dictionary further comprises a persona mapping dictionary for characterizing mappings between persona information and organizational entities.
Illustratively, the custom attributes include at least one of a descriptive attribute and a pronunciation attribute; the descriptive attribute is used for representing at least one of function information, brand information, product information, field information, address information, organization name abbreviation, organization full name, and character information related to the organization entity; the pronunciation attribute is used to characterize the pronunciation of the organization name.
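One possible in-memory representation of an organization entity with these custom attributes is sketched below in Python. The class and field names are hypothetical, and the choice of lists and optional strings is an assumption, since the passage above does not prescribe a storage format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DescriptiveAttributes:
    """Descriptive attribute fields listed above; all are optional."""
    function_info: List[str] = field(default_factory=list)
    brand_info: List[str] = field(default_factory=list)
    product_info: List[str] = field(default_factory=list)
    field_info: List[str] = field(default_factory=list)
    address: Optional[str] = None
    short_names: List[str] = field(default_factory=list)
    full_name: Optional[str] = None
    character_info: List[str] = field(default_factory=list)


@dataclass
class OrganizationEntity:
    """An organization entity carrying descriptive and pronunciation attributes."""
    name: str
    descriptive: DescriptiveAttributes = field(default_factory=DescriptiveAttributes)
    pronunciation: Optional[str] = None


# Hypothetical entity based on the fig. 5 example.
entity = OrganizationEntity(
    name="XX city health committee",
    descriptive=DescriptiveAttributes(address="XX city", full_name="XX city health committee"),
)
```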
Fig. 7 schematically illustrates a block diagram of an entity linking apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the entity linking apparatus 700 of the embodiment of the present disclosure includes, for example, a reference text determination module 710, a candidate entity determination module 720, and a target entity determination module 730.
The reference text determination module 710 is configured to determine a reference text in the input text.
A candidate entity determination module 720, configured to determine, from the organization knowledge base, a candidate entity related to the reference text according to the reference text.
The target entity determining module 730 is configured to determine a target entity linked to the reference text according to the candidate entity.
The organization knowledge base is constructed using the following modules: an organization entity determination module configured to determine an organization entity according to organization data, where the organization entity has a custom attribute, and the custom attribute characterizes organization features of the organization entity; and a knowledge base construction module configured to construct the organization knowledge base according to the organization entity.
Illustratively, the candidate entity comprises a first candidate entity; the candidate entity determination module includes: an address information determining submodule configured to determine address information corresponding to the reference text according to the input text corresponding to the reference text and the analysis result of the reference text; and a first candidate entity determining submodule configured to match the address information corresponding to the reference text with the organization entities in the organization knowledge base to determine the first candidate entity, where the custom attribute of the organization entity includes a description attribute, and the description attribute is used for representing the address information related to the organization entity.
Illustratively, the first candidate entity determination submodule includes: the hierarchy matching unit is used for matching the address information corresponding to the reference text with the organization entity in the organization knowledge base, and obtaining a matching result of each hierarchy address based on each hierarchy address of the address hierarchy structure; and a first candidate entity determining unit configured to determine a first candidate entity according to a matching result of each hierarchical address, wherein the address hierarchy includes an administrative division address including a plurality of hierarchical addresses.
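Level-by-level address matching can be sketched in Python as follows. The sketch assumes three administrative levels (province, city, district) and an acceptance rule under which an entity is kept when at least one level matches and no level conflicts. Both the level names and the acceptance rule are assumptions made for illustration; the passage above only states that a matching result is obtained for each level.

```python
from typing import Dict, List, Optional

LEVELS = ("province", "city", "district")  # assumed administrative levels
Address = Dict[str, str]


def match_levels(mention: Address, entity: Address) -> Dict[str, Optional[bool]]:
    """Compare addresses per level; None means the level is missing on one side."""
    result: Dict[str, Optional[bool]] = {}
    for level in LEVELS:
        m, e = mention.get(level), entity.get(level)
        result[level] = None if m is None or e is None else m == e
    return result


def first_candidates(mention: Address, entities: Dict[str, Address]) -> List[str]:
    """Keep entities with at least one matching level and no conflicting level."""
    kept: List[str] = []
    for name, address in entities.items():
        outcomes = list(match_levels(mention, address).values())
        if any(o is True for o in outcomes) and not any(o is False for o in outcomes):
            kept.append(name)
    return kept


# Hypothetical knowledge base entries.
kb_addresses = {
    "XX city health committee": {"province": "P province", "city": "XX city"},
    "YY city health committee": {"province": "P province", "city": "YY city"},
}
recalled = first_candidates({"city": "XX city"}, kb_addresses)
```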
Illustratively, the candidate entity determination module includes: the entity category determination submodule is used for carrying out entity category identification on the reference text to obtain an entity category related to the reference text; and a candidate entity determination sub-module for determining a candidate entity related to the reference text according to the correlation between the entity class of the reference text and the entity class of the organization entity.
Illustratively, the candidate entity comprises a second candidate entity; the candidate entity determination module includes: an abbreviation determining submodule configured to determine the organization name abbreviation corresponding to the reference text according to the analysis result of the reference text; a candidate abbreviation determining submodule configured to determine candidate abbreviations according to the organization name abbreviation corresponding to the reference text and the abbreviation dictionary; and a second candidate entity determining submodule configured to match the candidate abbreviations with the organization entities in the organization knowledge base to determine the second candidate entity, where the custom attribute of the organization entity includes a description attribute, and the description attribute is further used to characterize an organization name abbreviation related to the organization entity.
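Abbreviation-based recall can be sketched in Python as follows. The sketch assumes that the abbreviation dictionary maps an abbreviation to a list of equivalent abbreviations and that each entity stores its known abbreviations as a list; the function name and the sample data are hypothetical.

```python
from typing import Dict, List


def second_candidates(
    mention_abbreviation: str,
    abbreviation_dictionary: Dict[str, List[str]],
    entity_abbreviations: Dict[str, List[str]],
) -> List[str]:
    """Expand the mention via the dictionary, then match entity abbreviations."""
    candidates = {mention_abbreviation}
    candidates.update(abbreviation_dictionary.get(mention_abbreviation, []))
    return [
        name
        for name, short_names in entity_abbreviations.items()
        if candidates.intersection(short_names)
    ]


# Hypothetical dictionary and knowledge base entries.
abbrev_dict = {"city health commission": ["city health committee"]}
kb_abbrevs = {
    "XX city health committee": ["city health committee"],
    "XX city education bureau": ["city education bureau"],
}
recalled = second_candidates("city health commission", abbrev_dict, kb_abbrevs)
```

Expanding the mention before matching lets a variant abbreviation recall an entity that stores only a different form.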
Illustratively, the candidate entity comprises a third candidate entity; the candidate entity determination module includes: a full name determining submodule configured to determine the organization full name corresponding to the reference text according to the analysis result of the reference text; and a third candidate entity determining submodule configured to match the organization full name corresponding to the reference text with the organization entities in the organization knowledge base to determine the third candidate entity, where the custom attribute of the organization entity includes a description attribute, and the description attribute is further used for representing the organization full name related to the organization entity.
Illustratively, the candidate entity comprises a fourth candidate entity; the candidate entity determination module includes: an information determining submodule configured to determine at least one of function information, brand information, product information, and field information corresponding to the reference text according to the analysis result of the reference text; and a fourth candidate entity determining submodule configured to match the at least one of function information, brand information, product information, and field information corresponding to the reference text with the organization entities in the organization knowledge base to determine the fourth candidate entity, where the custom attribute of the organization entity includes a description attribute, and the description attribute is further used to characterize at least one of function information, brand information, product information, and field information related to the organization entity.
Illustratively, the candidate entity comprises a fifth candidate entity; the candidate entity determination module includes: the pronunciation information determination submodule is used for determining pronunciation information corresponding to the reference text according to the analysis result of the reference text; and a fifth candidate entity determining sub-module, configured to match the pronunciation information corresponding to the reference text with the organization entity in the organization knowledge base, and determine a fifth candidate entity, where the custom attribute of the organization entity includes a pronunciation attribute, and the pronunciation attribute is used to characterize a pronunciation of an organization name.
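Pronunciation matching can be sketched in Python as follows. The sketch assumes that pronunciations are available as romanized strings produced by an upstream step, since the passage above does not specify a representation. The comparison is a simple equality test after normalization. The function names and the sample pronunciations are hypothetical.

```python
from typing import Dict, List


def normalize_pronunciation(pronunciation: str) -> str:
    """Lowercase the string and drop whitespace between syllables."""
    return "".join(pronunciation.lower().split())


def fifth_candidates(
    mention_pronunciation: str,
    entity_pronunciations: Dict[str, str],
) -> List[str]:
    """Return entities whose pronunciation attribute equals the mention's."""
    key = normalize_pronunciation(mention_pronunciation)
    return [
        name
        for name, pronunciation in entity_pronunciations.items()
        if normalize_pronunciation(pronunciation) == key
    ]


# Hypothetical pronunciation attributes stored in the knowledge base.
kb_pronunciations = {"Entity A": "xin da", "Entity B": "xing da"}
recalled = fifth_candidates("Xin Da", kb_pronunciations)
```

Matching on pronunciation rather than spelling allows a mention written with homophonous characters to recall the intended entity.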
Illustratively, the candidate entity comprises a sixth candidate entity; the candidate entity determination module includes: a character information determining submodule configured to determine character information corresponding to the reference text according to the analysis result of the reference text; a candidate character information determining submodule configured to determine candidate character information according to the character information corresponding to the reference text and the character mapping dictionary; and a sixth candidate entity determining submodule configured to match the candidate character information with the organization entities in the organization knowledge base to determine the sixth candidate entity, where the custom attribute of the organization entity includes a description attribute, and the description attribute is further used to characterize character information related to the organization entity.
Illustratively, the target entity determination module includes: the correlation evaluation value determination submodule is used for determining a correlation evaluation value of the candidate entity according to the reference evaluation value and the weight of the candidate entity; and a target entity determination submodule, configured to determine a target entity from candidate entities according to a relevance evaluation value of the candidate entities, where the candidate entities include at least one of a first candidate entity, a second candidate entity, a third candidate entity, a fourth candidate entity, a fifth candidate entity, and a sixth candidate entity.
Illustratively, the correlation evaluation value determination submodule includes: the word composition determining unit is used for analyzing the candidate entity to obtain the word composition of the candidate entity; and a correlation evaluation value determination unit configured to determine, for any one of the candidate entities, a correlation evaluation value of the candidate entity based on the reference evaluation value and the weight of the word composition.
It should be understood that the embodiments of the apparatus portion of the present disclosure correspond to the same or similar embodiments of the method portion of the present disclosure, and the technical problems to be solved and the technical effects to be achieved also correspond to the same or similar embodiments, which are not described herein in detail.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in device 800 are connected to I/O interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the respective methods and processes described above, such as a knowledge base construction method, an entity linking method. For example, in some embodiments, the knowledge base construction method, the entity linking method, may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the knowledge base construction method, the entity linking method, and the like described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the knowledge base construction method, the entity linking method, in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (34)

1. An entity linking method, comprising:
determining a reference text in the input text;
determining candidate entities related to the reference text from an organization knowledge base according to the reference text;
determining a target entity linked with the reference text according to the candidate entity,
the organization knowledge base is constructed by the following operations:
determining an organization entity according to organization data, wherein the organization entity has a custom attribute, and the custom attribute characterizes organization characteristics of the organization entity, wherein the custom attribute comprises a pronunciation attribute, and the pronunciation attribute is used for characterizing pronunciation of an organization name; and
constructing an organization knowledge base according to the organization entities,
wherein said determining, from the organization knowledge base, candidate entities related to the reference text based on the reference text comprises:
determining, according to the analysis result of the reference text, pronunciation information of word size information of an organization entity corresponding to the reference text, wherein the organization entity comprises an organization name, the word size information is from word formation components of the organization name, and the word formation components are obtained by carrying out structural analysis on the organization name according to naming rules; and
matching the pronunciation information of the word size information of the organization entity corresponding to the reference text with the organization entities in the organization knowledge base to determine a fifth candidate entity.
2. The method of claim 1, wherein the candidate entity comprises a first candidate entity; the determining, from the organization knowledge base, candidate entities related to the reference text according to the reference text includes:
determining address information corresponding to the reference text according to the input text corresponding to the reference text and the analysis result of the reference text; and
matching the address information corresponding to the reference text with the organization entities in the organization knowledge base to determine the first candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is used for representing the address information related to the organization entity.
3. The method of claim 2, wherein the matching the address information corresponding to the reference text with the organization entities in the organization knowledge base to determine the first candidate entity comprises:
matching the address information corresponding to the reference text with the organization entities in the organization knowledge base based on each level address of an address hierarchy structure to obtain a matching result of each level address; and
determining the first candidate entity according to the matching result of each hierarchical address, wherein the address hierarchical structure comprises an administrative division address, and the administrative division address comprises a plurality of hierarchical addresses.
4. The method of claim 1, wherein said determining, from the organization knowledge base, candidate entities related to the reference text from the reference text comprises:
performing entity category identification on the reference text to obtain an entity category related to the reference text; and
determining candidate entities related to the reference text according to the correlation between the entity category of the reference text and the entity category of the organization entity.
5. The method of any of claims 1-4, wherein the candidate entity comprises a second candidate entity; the determining, from the organization knowledge base, candidate entities related to the reference text according to the reference text includes:
determining an organization name abbreviation corresponding to the reference text according to the analysis result of the reference text;
determining candidate abbreviations according to the organization name abbreviation corresponding to the reference text and the abbreviation dictionary; and
matching the candidate abbreviations with the organization entities in the organization knowledge base to determine the second candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is also used for representing organization name abbreviations related to the organization entity.
6. The method of any of claims 1-4, wherein the candidate entity comprises a third candidate entity; the determining, from the organization knowledge base, candidate entities related to the reference text according to the reference text includes:
determining the organization full name corresponding to the reference text according to the analysis result of the reference text; and
matching the organization full name corresponding to the reference text with the organization entities in the organization knowledge base to determine the third candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is also used for representing the organization full name related to the organization entity.
7. The method of any of claims 1-4, wherein the candidate entity comprises a fourth candidate entity; the determining, from the organization knowledge base, candidate entities related to the reference text according to the reference text includes:
determining at least one of function information, brand information, product information and field information corresponding to the reference text according to the analysis result of the reference text; and
matching at least one of function information, brand information, product information and field information corresponding to the reference text with the organization entities in the organization knowledge base to determine the fourth candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is also used for characterizing at least one of function information, brand information, product information and field information related to the organization entity.
8. The method of any of claims 1-4, wherein the candidate entity comprises a sixth candidate entity; the determining, from the organization knowledge base, candidate entities related to the reference text according to the reference text includes:
according to the analysis result of the reference text, determining character information corresponding to the reference text;
determining candidate character information according to the character information corresponding to the reference text and the character mapping dictionary; and
matching the candidate character information with the organization entities in the organization knowledge base to determine the sixth candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is also used for representing character information related to the organization entity.
9. The method of any of claims 1-4, wherein the determining a target entity linked to the reference text based on a correlation between the candidate entity and the reference text comprises:
determining a correlation evaluation value of the candidate entity according to the reference evaluation value and the weight of the candidate entity; and
determining the target entity from the candidate entities according to the relevance evaluation value of the candidate entities, wherein the candidate entities comprise at least one of a first candidate entity, a second candidate entity, a third candidate entity, a fourth candidate entity, a fifth candidate entity and a sixth candidate entity.
10. The method of claim 9, wherein the determining the relevance estimate for the candidate entity based on the reference estimate and the weight for the candidate entity comprises:
analyzing the candidate entity to obtain word components of the candidate entity; and
and determining a correlation evaluation value of the candidate entity according to the reference evaluation value and the weight of the word composition aiming at any candidate entity.
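One way to read claims 9 and 10 is as a weighted sum over word components. The sketch below assumes that reading. The component kinds, the weights, and the rule that a component scores 1.0 when its text occurs in the reference text are illustrative assumptions, not values taken from the patent.

```python
from typing import Dict

# Hypothetical weight for each kind of word component.
WEIGHTS: Dict[str, float] = {
    "region": 0.2,
    "trade_name": 0.5,
    "industry": 0.2,
    "form": 0.1,
}


def relevance(components: Dict[str, str], mention: str) -> float:
    """Weighted sum of per-component reference evaluation values.

    Assumption: the reference evaluation value of a component is 1.0 when
    its text occurs in the mention and 0.0 otherwise.
    """
    score = 0.0
    for kind, text in components.items():
        reference_value = 1.0 if text and text in mention else 0.0
        score += WEIGHTS.get(kind, 0.0) * reference_value
    return score


def pick_target(candidates: Dict[str, Dict[str, str]], mention: str) -> str:
    """Return the candidate with the highest relevance evaluation value."""
    return max(candidates, key=lambda name: relevance(candidates[name], mention))
```

A graded reference value (for example, a string-similarity score) could replace the 0/1 rule without changing the structure.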
11. The method of claim 1, wherein said constructing an organization knowledge base from said organization entities comprises:
structuring the organization entity, and determining category information for representing organization categories, wherein the organization entity comprises an organization name, and the organization name comprises the category information;
clustering the category information to determine at least one organization category cluster;
determining a knowledge dictionary of the organization category according to the organization category cluster; and
constructing the organization knowledge base according to the knowledge dictionary and the organization entity.
12. The method of claim 11, wherein the knowledge dictionary comprises a same-category dictionary; the determining the knowledge dictionary of the organization category according to the organization category cluster comprises:
determining same-category information according to the category information associated with the same organization category cluster; and
determining the same-category dictionary according to the same-category information.
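As an illustration of claim 12, the sketch below turns cluster assignments into a same-category dictionary. The clustering step itself is assumed to have already run (the claims do not name an algorithm), and the function name and input shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def same_category_dictionary(
    assignments: Iterable[Tuple[str, int]],
) -> Dict[str, List[str]]:
    """Map each category term to the other terms in its cluster.

    `assignments` holds (category term, cluster id) pairs produced by some
    upstream clustering step, which is outside the scope of this sketch.
    """
    clusters: Dict[int, Set[str]] = defaultdict(set)
    for term, cluster_id in assignments:
        clusters[cluster_id].add(term)

    dictionary: Dict[str, List[str]] = {}
    for terms in clusters.values():
        for term in terms:
            # Terms sharing a cluster are treated as same-category information.
            dictionary[term] = sorted(terms - {term})
    return dictionary
```

A term that sits alone in its cluster maps to an empty list, so lookups never fail for known terms.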
13. The method of claim 11, wherein the knowledge dictionary further comprises an abbreviation dictionary; the determining the knowledge dictionary of the organization category according to the organization category cluster comprises:
for any organization category cluster, determining an abbreviation set of each organization according to an analysis result of the organization name; and
determining the abbreviation dictionary of the organization category according to the abbreviation set of each organization.
14. The method of claim 13, wherein the determining, for any organization category cluster, an abbreviation set of each organization according to an analysis result of the organization name comprises:
for any organization category cluster, determining the abbreviation set of each organization according to the analysis result of the organization name and abbreviation combination rules, wherein the abbreviation combination rules characterize rules for generating organization name abbreviations from components of the organization name, and the components of the organization name are obtained by analyzing the organization name.
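Claims 13 and 14 can be sketched as follows, assuming each organization name has already been parsed into named components. The component kinds, the three rules in `RULES`, and the space separator are hypothetical; real Chinese organization names would normally be joined without a separator.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical abbreviation combination rules: each rule lists the name
# components that are concatenated to form one abbreviation.
RULES: List[Sequence[str]] = [
    ("trade_name",),
    ("region", "trade_name"),
    ("trade_name", "industry"),
]


def abbreviation_set(
    components: Dict[str, str],
    rules: Sequence[Sequence[str]] = RULES,
    sep: str = " ",
) -> Set[str]:
    """Apply every rule whose components are all present and non-empty."""
    result: Set[str] = set()
    for rule in rules:
        if all(components.get(kind) for kind in rule):
            result.add(sep.join(components[kind] for kind in rule))
    return result


def abbreviation_dictionary(
    orgs: Dict[str, Dict[str, str]],
) -> Dict[str, Set[str]]:
    """Invert the per-organization sets: abbreviation -> full names."""
    index: Dict[str, Set[str]] = {}
    for full_name, components in orgs.items():
        for abbreviation in abbreviation_set(components):
            index.setdefault(abbreviation, set()).add(full_name)
    return index
```

Inverting the sets lets one abbreviation point to several organizations, which is the situation the later candidate ranking has to resolve.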
15. The method of claim 11, wherein the knowledge dictionary further comprises a persona mapping dictionary for characterizing mappings between persona information and the organizational entities.
16. The method of any of claims 11-15, wherein the custom attribute further comprises a description attribute; the description attribute is used for characterizing at least one of function information, brand information, product information, field information, address information, an organization name abbreviation, an organization full name and persona information related to the organization entity.
17. An entity linking apparatus comprising:
a reference text determination module, configured to determine a reference text in an input text;
a candidate entity determination module, configured to determine, from an organization knowledge base, candidate entities related to the reference text according to the reference text; and
a target entity determination module, configured to determine a target entity linked to the reference text according to the candidate entities,
wherein the organization knowledge base is constructed by the following modules:
an organization entity determination module, configured to determine an organization entity according to organization data, wherein the organization entity has a custom attribute, the custom attribute characterizes organization features of the organization entity, the custom attribute comprises a pronunciation attribute, and the pronunciation attribute is used for characterizing the pronunciation of the organization name; and
a knowledge base construction module, configured to construct the organization knowledge base according to the organization entity;
wherein the candidate entity determination module comprises:
a pronunciation information determination submodule, configured to determine, according to an analysis result of the reference text, pronunciation information of trade name information of the organization entity corresponding to the reference text, wherein the organization entity comprises an organization name, the trade name information comes from word components of the organization name, and the word components are obtained by structurally parsing the organization name according to naming rules; and
a fifth candidate entity determination submodule, configured to match the pronunciation information of the trade name information of the organization entity corresponding to the reference text with the organization entities in the organization knowledge base to determine a fifth candidate entity.
18. The apparatus of claim 17, wherein the candidate entity comprises a first candidate entity; the candidate entity determination module includes:
an address information determination submodule, configured to determine address information corresponding to the reference text according to the input text corresponding to the reference text and the analysis result of the reference text; and
a first candidate entity determination submodule, configured to match the address information corresponding to the reference text with the organization entities in the organization knowledge base to determine the first candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is used for characterizing address information related to the organization entity.
19. The apparatus of claim 18, wherein the first candidate entity determination submodule comprises:
a hierarchy matching unit, configured to match the address information corresponding to the reference text with the organization entities in the organization knowledge base at each hierarchical address of an address hierarchy structure, to obtain a matching result for each hierarchical address; and
a first candidate entity determination unit, configured to determine the first candidate entity according to the matching result for each hierarchical address, wherein the address hierarchy structure comprises an administrative division address, and the administrative division address comprises a plurality of hierarchical addresses.
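The level-by-level address matching of claims 18 and 19 can be illustrated with a short sketch. The three levels (province, city, district), the rule that a conflict at any level rejects an entity, and the ranking by number of matched levels are assumptions made for illustration; the claims only require a matching result per hierarchical address.

```python
from typing import Dict, List, Optional

# Hypothetical administrative-division levels, from coarse to fine.
LEVELS = ("province", "city", "district")
Address = Dict[str, Optional[str]]


def match_levels(mention: Address, entity: Address) -> Dict[str, Optional[bool]]:
    """Per-level result: True = same, False = conflict, None = not comparable."""
    result: Dict[str, Optional[bool]] = {}
    for level in LEVELS:
        a, b = mention.get(level), entity.get(level)
        result[level] = None if not a or not b else a == b
    return result


def first_candidates(mention: Address, kb: Dict[str, Address]) -> List[str]:
    """Keep entities with no conflicting level, ranked by matched levels."""
    scored = []
    for name, address in kb.items():
        outcomes = list(match_levels(mention, address).values())
        if any(outcome is False for outcome in outcomes):
            continue  # a conflicting level rules the entity out
        hits = sum(1 for outcome in outcomes if outcome is True)
        if hits:
            scored.append((hits, name))
    return [name for hits, name in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

Treating a missing level as "not comparable" rather than as a mismatch keeps entities whose stored address is coarser than the address found in the input text.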
20. The apparatus of claim 17, wherein the candidate entity determination module comprises:
an entity category determination submodule, configured to perform entity category recognition on the reference text to obtain an entity category related to the reference text; and
a candidate entity determination submodule, configured to determine the candidate entities related to the reference text according to the correlation between the entity category of the reference text and the entity category of the organization entity.
21. The apparatus of any of claims 17-20, wherein the candidate entity comprises a second candidate entity; the candidate entity determination module includes:
an abbreviation determination submodule, configured to determine an organization name abbreviation corresponding to the reference text according to the analysis result of the reference text;
a candidate abbreviation determination submodule, configured to determine candidate abbreviations according to the organization name abbreviation corresponding to the reference text and an abbreviation dictionary; and
a second candidate entity determination submodule, configured to match the candidate abbreviations with the organization entities in the organization knowledge base to determine the second candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is further used for characterizing the organization name abbreviation related to the organization entity.
22. The apparatus of any of claims 17-20, wherein the candidate entity comprises a third candidate entity; the candidate entity determination module includes:
a full name determination submodule, configured to determine an organization full name corresponding to the reference text according to the analysis result of the reference text; and
a third candidate entity determination submodule, configured to match the organization full name corresponding to the reference text with the organization entities in the organization knowledge base to determine the third candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is further used for characterizing the organization full name related to the organization entity.
23. The apparatus of any of claims 17-20, wherein the candidate entity comprises a fourth candidate entity; the candidate entity determination module includes:
an information determination submodule, configured to determine at least one of function information, brand information, product information and field information corresponding to the reference text according to the analysis result of the reference text; and
a fourth candidate entity determination submodule, configured to match at least one of the function information, brand information, product information and field information corresponding to the reference text with the organization entities in the organization knowledge base to determine the fourth candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is further used for characterizing at least one of function information, brand information, product information and field information related to the organization entity.
24. The apparatus of any of claims 17-20, wherein the candidate entity comprises a sixth candidate entity; the candidate entity determination module includes:
a persona information determination submodule, configured to determine persona information corresponding to the reference text according to the analysis result of the reference text;
a candidate persona information determination submodule, configured to determine candidate persona information according to the persona information corresponding to the reference text and a persona mapping dictionary; and
a sixth candidate entity determination submodule, configured to match the candidate persona information with the organization entities in the organization knowledge base to determine the sixth candidate entity, wherein the custom attribute of the organization entity comprises a description attribute, and the description attribute is further used for characterizing persona information related to the organization entity.
25. The apparatus of any of claims 17-20, wherein the target entity determination module comprises:
a relevance evaluation value determination submodule, configured to determine a relevance evaluation value of the candidate entity according to a reference evaluation value and a weight of the candidate entity; and
a target entity determination submodule, configured to determine the target entity from the candidate entities according to the relevance evaluation values of the candidate entities, wherein the candidate entities comprise at least one of a first candidate entity, a second candidate entity, a third candidate entity, a fourth candidate entity, a fifth candidate entity and a sixth candidate entity.
26. The apparatus of claim 25, wherein the relevance evaluation value determination submodule comprises:
a word component determination unit, configured to analyze the candidate entity to obtain word components of the candidate entity; and
a relevance evaluation value determination unit, configured to determine, for any candidate entity, the relevance evaluation value of the candidate entity according to the reference evaluation value and the weight of the word components.
27. The apparatus of claim 17, wherein the knowledge base construction module comprises:
a category information determining submodule, configured to structure the organization entity and determine category information for characterizing organization categories, wherein the organization entity comprises an organization name, and the organization name comprises the category information;
a category cluster determining submodule, configured to cluster the category information and determine at least one organization category cluster;
a knowledge dictionary determining submodule, configured to determine a knowledge dictionary of the organization category according to the organization category cluster; and
a knowledge base determining submodule, configured to construct the organization knowledge base according to the knowledge dictionary and the organization entity.
28. The apparatus of claim 27, wherein the knowledge dictionary comprises a same-category dictionary; the knowledge dictionary determining submodule comprises:
a same-category information determining unit, configured to determine same-category information according to the category information associated with the same organization category cluster; and
a same-category dictionary determining unit, configured to determine the same-category dictionary according to the same-category information.
29. The apparatus of claim 27, wherein the knowledge dictionary further comprises an abbreviation dictionary; the knowledge dictionary determining submodule comprises:
an abbreviation set determination unit, configured to determine, for any organization category cluster, an abbreviation set of each organization according to an analysis result of the organization name; and
an abbreviation dictionary determination unit, configured to determine the abbreviation dictionary of the organization category according to the abbreviation set of each organization.
30. The apparatus of claim 29, wherein the abbreviation set determination unit comprises:
an abbreviation set determination subunit, configured to determine, for any organization category cluster, the abbreviation set of each organization according to the analysis result of the organization name and abbreviation combination rules, wherein the abbreviation combination rules characterize rules for generating organization name abbreviations from components of the organization name, and the components of the organization name are obtained by analyzing the organization name.
31. The apparatus of claim 27, wherein the knowledge dictionary further comprises a persona mapping dictionary for characterizing mappings between persona information and the organizational entities.
32. The apparatus of any of claims 27-31, wherein the custom attribute further comprises a description attribute; the description attribute is used for characterizing at least one of function information, brand information, product information, field information, address information, an organization name abbreviation, an organization full name and persona information related to the organization entity.
33. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-16.
34. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-16.
CN202310269188.9A 2023-03-15 2023-03-15 Knowledge base construction method, entity linking method, device and equipment Active CN116258138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310269188.9A CN116258138B (en) 2023-03-15 2023-03-15 Knowledge base construction method, entity linking method, device and equipment

Publications (2)

Publication Number Publication Date
CN116258138A (en) 2023-06-13
CN116258138B (en) 2024-01-02

Family

ID=86687899

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202310269188.9A Active CN116258138B (en) 2023-03-15 2023-03-15 Knowledge base construction method, entity linking method, device and equipment

Country Status (1)

Country Link
CN (1) CN116258138B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202382A (en) * 2016-07-08 2016-12-07 南京缘长信息科技有限公司 Link instance method and system
CN108415902A (en) * 2018-02-10 2018-08-17 合肥工业大学 A kind of name entity link method based on search engine
CN111428478A (en) * 2020-03-20 2020-07-17 北京百度网讯科技有限公司 Evidence searching method, device, equipment and storage medium for term synonymy discrimination
CN112182312A (en) * 2020-09-23 2021-01-05 中国建设银行股份有限公司 Mechanism name matching method and device, electronic equipment and readable storage medium
CN114328937A (en) * 2022-03-10 2022-04-12 中国医学科学院医学信息研究所 Scientific research institution information processing method and device
CN115757689A (en) * 2022-09-21 2023-03-07 中国人民解放军军事科学院军事科学信息研究中心 Information query system, method and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8346795B2 (en) * 2010-03-10 2013-01-01 Xerox Corporation System and method for guiding entity-based searching

Also Published As

Publication number Publication date
CN116258138A (en) 2023-06-13

Similar Documents

Publication Publication Date Title
AU2018383346B2 (en) Domain-specific natural language understanding of customer intent in self-help
Arulmurugan et al. RETRACTED ARTICLE: Classification of sentence level sentiment analysis using cloud machine learning techniques
CN112507715A (en) Method, device, equipment and storage medium for determining incidence relation between entities
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN110737774A (en) Book knowledge graph construction method, book recommendation method, device, equipment and medium
US20220121668A1 (en) Method for recommending document, electronic device and storage medium
CN111783861A (en) Data classification method, model training device and electronic equipment
CN113051380A (en) Information generation method and device, electronic equipment and storage medium
CN111966781A (en) Data query interaction method and device, electronic equipment and storage medium
CN113836316B (en) Processing method, training method, device, equipment and medium for ternary group data
CN114357951A (en) Method, device, equipment and storage medium for generating standard report
CN112597768B (en) Text auditing method, device, electronic equipment, storage medium and program product
CN114116997A (en) Knowledge question answering method, knowledge question answering device, electronic equipment and storage medium
CN111555960A (en) Method for generating information
CN114201622B (en) Method and device for acquiring event information, electronic equipment and storage medium
CN116258138B (en) Knowledge base construction method, entity linking method, device and equipment
CN112328653B (en) Data identification method, device, electronic equipment and storage medium
CN112989011B (en) Data query method, data query device and electronic equipment
CN112926297B (en) Method, apparatus, device and storage medium for processing information
CN114969371A (en) Heat sorting method and device of combined knowledge graph
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN115248890A (en) User interest portrait generation method and device, electronic equipment and storage medium
Ye et al. A natural language-based flight searching system
CN112015989A (en) Method and device for pushing information
CN115809334B (en) Training method of event relevance classification model, text processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant